What's more general than a whole population?
Bottom Line:
Statistical inference is commonly said to be inapplicable to complete population studies, such as censuses, due to the absence of sampling variability. With reference to the social science literature, the current paper explores the circumstances under which statistical inference can be meaningful for such studies. It concludes that its use implicitly requires a target population which is wider than the whole population studied - for example future cases, or a supranational geographic region - and that the validity of such statistical analysis depends on the generalizability of the whole to the target population.
View Article:
PubMed Central - PubMed
Affiliation: MRC Tropical Epidemiology Group, London School of Hygiene and Tropical Medicine, Keppel Street, London, UK.
ABSTRACT
Statistical inference is commonly said to be inapplicable to complete population studies, such as censuses, due to the absence of sampling variability. Nevertheless, in recent years, studies of whole populations, e.g., all cases of a certain cancer in a given country, have become more common, and often report p values and confidence intervals regardless of such concerns. With reference to the social science literature, the current paper explores the circumstances under which statistical inference can be meaningful for such studies. It concludes that its use implicitly requires a target population which is wider than the whole population studied - for example future cases, or a supranational geographic region - and that the validity of such statistical analysis depends on the generalizability of the whole to the target population.
Classical frequentist statistics relies on the notion of a sampling population to justify probability statements such as p values or confidence interval coverage. A sampling population is a source of variation over putative repeated samples, as illustrated in Fig. 1. For such variation to occur, the sample must be smaller than the population, otherwise ‘sampling errors disappear altogether’ [2] (p659) and p values tend to zero. This is formalized in the finite population adjustment, which reduces standard errors towards zero as the size of the sample approaches that of the population [3] (p436). This adjustment is rarely used because, in the classical framework, the sample is much smaller than the population. Sometimes this inequality is simply asserted, e.g., ‘We cannot study all the population’ [4]. However, advances in computerization of health information, for example through national cancer registries [5] and mass genotyping [6], have made it more feasible to study groups which can reasonably be called ‘whole populations’. Although missing data can rarely, if ever, be ruled out, some studies, for example those based on cancer registries, have achieved very low levels of missing data [7, 8]. The social sciences recognise repeated sampling to be inapplicable to certain kinds of whole population studies [9]. For example, when studying characteristics of the ten largest cities in a given country, based on data aggregated from the latest national census, re-doing the study would not subject the data to sampling variation. To understand how statistical inference might, nevertheless, be applicable to whole population studies, we need to distinguish different uses of the word ‘population’.
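The finite population adjustment mentioned above can be illustrated numerically. The following is a minimal sketch, not taken from the article: it assumes the common textbook form of the correction for the standard error of a mean under simple random sampling without replacement, SE = (s / sqrt(n)) * sqrt((N - n) / (N - 1)). The function name `fpc_standard_error` and the numbers used (a standard deviation of 15 and a population of 1000) are hypothetical and chosen only for illustration.

```python
import math


def fpc_standard_error(sample_sd, n, population_size):
    """Standard error of a sample mean with the finite population correction.

    Assumes simple random sampling without replacement (an assumption of
    this sketch, not something stated in the article).
    """
    if not 1 <= n <= population_size:
        raise ValueError("n must lie between 1 and population_size")
    if population_size == 1:
        return 0.0  # a population of one leaves nothing unsampled
    uncorrected_se = sample_sd / math.sqrt(n)
    correction = math.sqrt((population_size - n) / (population_size - 1))
    return uncorrected_se * correction


# Hypothetical illustration: the standard error shrinks as n approaches N
# and is exactly zero when the whole population is observed.
for n in (10, 100, 500, 900, 1000):
    print(n, fpc_standard_error(15.0, n, 1000))
```

When n equals N the correction factor is zero, which is the sense in which sampling error "disappears" in a complete population study. Any uncertainty reported for such a study must therefore come from somewhere other than sampling from the population studied.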