Using Benford's law to investigate Natural Hazard dataset homogeneity.

Joannes-Boyau R, Bodin T, Scheffers A, Sambridge M, May SM - Sci Rep (2015)

Bottom Line: We have found that, while the first-digit distribution for the entire record follows Benford's Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The least-square misfit measure is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford's Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.


Affiliation: Southern Cross GeoScience, Southern Cross University, Lismore, NSW, 2480, Australia.

ABSTRACT
Working with a large temporal dataset spanning several decades often represents a challenging task, especially when the record is heterogeneous and incomplete. The use of statistical laws could potentially overcome these problems. Here we apply Benford's Law (also called the "First-Digit Law") to the traveled distances of tropical cyclones since 1842. The record of tropical cyclones has been extensively impacted by improvements in detection capabilities over the past decades. We have found that, while the first-digit distribution for the entire record follows Benford's Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The least-square misfit measure is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford's Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.
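
The first-digit test and misfit measure described in the abstract can be illustrated with a minimal Python sketch. Benford's Law predicts that a leading digit d occurs with probability log10(1 + 1/d). The misfit below is assumed to be the plain sum of squared differences between observed and predicted digit frequencies; the exact normalisation used by the authors is not stated here, and all function names are hypothetical.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's Law probabilities for leading digits 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(value):
    """Leading non-zero digit of a non-zero number, read from its
    scientific-notation representation."""
    return int(f"{abs(value):.17e}"[0])


def first_digit_frequencies(values):
    """Observed relative frequency of each leading digit 1..9."""
    digits = [first_digit(v) for v in values if v != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    counts = Counter(digits)
    return [counts.get(d, 0) / len(digits) for d in range(1, 10)]


def least_squares_misfit(values):
    """Sum of squared differences between observed and Benford
    frequencies (assumed form of the misfit; lower means closer)."""
    observed = first_digit_frequencies(values)
    expected = benford_expected()
    return sum((o - e) ** 2 for o, e in zip(observed, expected))
```

Applied to the traveled distances within successive time windows, a measure of this kind yields one misfit value per window, which is how it can serve as a proxy for temporal variations in data homogeneity.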



Figure 3: Temporal variations of the categories of distance traveled by TCs. Frequency of TC occurrences relative to the category of distance traveled: (i) short (<1,000 km), (ii) medium (1,000 km < x < 5,000 km), and (iii) long (>5,000 km). The plain curve corresponds to the 5-year running mean (5YRM).

Mentions: In Fig. 3 we have plotted the evolution of TC tracks over time in three categories of distances traveled: (i) short (<1,000 km), (ii) medium (1,000 km < x < 5,000 km), and (iii) long (>5,000 km). Over time, there has been a change in traveled distances, with a continuous increase in long distances traveled by TCs between 1930 and 2010. Most importantly, a severe and sudden shift occurred in the 1970s between short and medium distances. The overall shift after 1970 is also visible in Fig. 2 as a clear increase in the average distance traveled.
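
The binning and smoothing behind a plot like Fig. 3 could be sketched as follows. This is an illustration under assumptions, not the authors' procedure: the input is taken to be (year, distance in km) pairs, tracks of exactly 1,000 km or 5,000 km are placed in the medium category (the text leaves those boundaries unassigned), frequencies are computed as per-year fractions, and the 5-year running mean is assumed to be centered with shorter windows at the ends of the record. All names are hypothetical.

```python
from collections import defaultdict


def distance_category(km):
    """Assign a traveled distance to short / medium / long.
    Boundary values (1000, 5000) go to 'medium' by assumption."""
    if km < 1000:
        return "short"
    if km <= 5000:
        return "medium"
    return "long"


def yearly_category_fractions(records):
    """records: iterable of (year, distance_km) pairs.
    Returns {year: {category: fraction of that year's tracks}}."""
    counts = defaultdict(lambda: {"short": 0, "medium": 0, "long": 0})
    for year, km in records:
        counts[year][distance_category(km)] += 1
    fractions = {}
    for year, by_cat in sorted(counts.items()):
        total = sum(by_cat.values())
        fractions[year] = {cat: n / total for cat, n in by_cat.items()}
    return fractions


def running_mean(series, window=5):
    """Centered running mean; windows are truncated at the edges."""
    half = window // 2
    smoothed = []
    for i in range(len(series)):
        chunk = series[max(0, i - half): i + half + 1]
        smoothed.append(sum(chunk) / len(chunk))
    return smoothed
```

Passing the yearly fractions of one category, in chronological order, to `running_mean` gives a smoothed curve comparable in spirit to the 5YRM curves in the figure.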

