Using Benford's law to investigate Natural Hazard dataset homogeneity.

Joannes-Boyau R, Bodin T, Scheffers A, Sambridge M, May SM - Sci Rep (2015)

Bottom Line: We have found that, while the first-digit distribution for the entire record follows Benford's Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The least-square misfit measure is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford's Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.


Affiliation: Southern Cross GeoScience, Southern Cross University, Lismore, NSW, 2480, Australia.

ABSTRACT
Working with a large temporal dataset spanning several decades often represents a challenging task, especially when the record is heterogeneous and incomplete. The use of statistical laws could potentially overcome these problems. Here we apply Benford's Law (also called the "First-Digit Law") to the traveled distances of tropical cyclones since 1842. The record of tropical cyclones has been extensively impacted by improvements in detection capabilities over the past decades. We have found that, while the first-digit distribution for the entire record follows Benford's Law prediction, specific changes such as satellite detection have had serious impacts on the dataset. The least-square misfit measure is used as a proxy to observe temporal variations, allowing us to assess data quality and homogeneity over the entire record, and at the same time over specific periods. Such information is crucial when running climatic models and Benford's Law could potentially be used to overcome and correct for data heterogeneity and/or to select the most appropriate part of the record for detailed studies.
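The first-digit test described in the abstract can be illustrated with a minimal Python sketch. Benford's Law predicts that a leading digit d occurs with probability log10(1 + 1/d). The misfit below is assumed to be a plain sum of squared differences between observed and predicted frequencies; the article's exact definition is not given in this excerpt, so treat that formula, the function names, and the synthetic input (powers of two, not cyclone data) as illustrative assumptions only.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's Law: P(d) = log10(1 + 1/d) for leading digits d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(x):
    """Return the first non-zero digit of x, read from its shortest repr."""
    for ch in repr(abs(x)):
        if ch in "123456789":
            return int(ch)
    raise ValueError("value has no non-zero digit")


def first_digit_frequencies(values):
    """Observed relative frequency of each leading digit (zeros are skipped)."""
    digits = [first_digit(v) for v in values if v != 0 and math.isfinite(v)]
    counts = Counter(digits)
    return {d: counts.get(d, 0) / len(digits) for d in range(1, 10)}


def least_squares_misfit(observed, expected):
    """ASSUMED misfit: sum of squared frequency differences over digits 1..9."""
    return sum((observed[d] - expected[d]) ** 2 for d in range(1, 10))


# Synthetic stand-in for a list of travelled distances (NOT the cyclone record).
synthetic_values = [2 ** k for k in range(100)]
expected = benford_expected()
observed = first_digit_frequencies(synthetic_values)
misfit = least_squares_misfit(observed, expected)
print(f"misfit = {misfit:.5f}")
```

Applying the same functions to subsets of a record, for example one subset per time window, would give a misfit value per period, which is the kind of temporal proxy the abstract describes.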



Figure 2: Distribution of TC travelled distances and frequency since 1842 to present. The 1930s represent a significant improvement in the recording and measurement of tropical cyclone occurrences. Tropical cyclone travelled distances (km) from 1841 to 2010; the red curve is the 5-year running mean distance (km).

Mentions: The distance traveled by each TC was plotted against the year of occurrence (Fig. 2). The number of events increases with time, most likely due to improvements in scientific communications and observational capabilities. For example, only one cyclone track appears in the dataset for 1842 compared to 92 in 1900 and 297 in 1970. We also note that no TC tracks were reported along the Western Pacific coast in the early records (Figs 1 and 2). The minimum and maximum distances traveled in the dataset are 1.2 km and 18,947 km, respectively, spanning over four orders of magnitude. The average distance traveled by cyclones over the complete dataset is 2,560 km but changes from 1,796 km to 2,866 km prior to and after 1931, respectively.
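Two of the figures quoted above can be checked with a short Python sketch. The span in orders of magnitude is computed from the stated minimum and maximum (1.2 km and 18,947 km). The before/after comparison uses made-up toy records, not the cyclone dataset, and assumes that the split year itself belongs to the "after" group, since the text does not say which side 1931 falls on. All function names are hypothetical.

```python
import math
from statistics import mean


def orders_of_magnitude(smallest, largest):
    """Number of decades spanned between two positive values."""
    return math.log10(largest / smallest)


def mean_before_after(records, split_year):
    """Mean distance before split_year and from split_year onward.

    ASSUMPTION: the split year is counted in the 'after' group.
    """
    before = [dist for year, dist in records if year < split_year]
    after = [dist for year, dist in records if year >= split_year]
    return mean(before), mean(after)


# Minimum and maximum travelled distances (km) as quoted in the text.
span = orders_of_magnitude(1.2, 18947)
print(f"span = {span:.2f} orders of magnitude")

# Toy (year, distance in km) records, invented purely for illustration.
toy_records = [(1900, 1500.0), (1920, 2100.0), (1950, 2600.0), (1980, 3000.0)]
early, late = mean_before_after(toy_records, 1931)
print(f"mean before 1931: {early} km, from 1931 onward: {late} km")
```

The span evaluates to a little over 4, consistent with the statement that the distances cover more than four orders of magnitude.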

