Menage a quoi? Optimal number of peer reviewers.

Snell RR - PLoS ONE (2015)

Bottom Line: Here I use bootstrapping of replicated peer review data from a Post-doctoral Fellowships competition to show that five reviewers per application represents a practical optimum which avoids the large random effects evident when fewer reviewers are used, a point where additional reviewers at increasing cost provide only diminishing incremental gains in chance-corrected consistency of decision outcomes. Random effects were most evident in the relative mid-range of competitiveness. Results support aggressive high- and low-end stratification or triaging of applications for subsequent stages of review, with the proportion and set of mid-range submissions to be retained for further consideration being dependent on overall success rate.


Affiliation: Research Knowledge Translation and Ethics Portfolio, Canadian Institutes of Health Research, Ottawa, Ontario, Canada.

ABSTRACT
Peer review represents the primary mechanism used by funding agencies to allocate financial support and by journals to select manuscripts for publication, yet recent Cochrane reviews determined that literature on peer review best practice is sparse. Key to improving the process are reducing its inherent vulnerability to a high degree of randomness and, from an economic perspective, limiting both the substantial indirect costs of reviewer time invested and the direct administrative costs to funding agencies, publishers and research institutions. Use of additional reviewers per application may increase reliability and decision consistency, but adds to overall cost and burden. The optimal number of reviewers per application, while not known, is thought to vary with the accuracy of judges or evaluation methods. Here I use bootstrapping of replicated peer review data from a Post-doctoral Fellowships competition to show that five reviewers per application represents a practical optimum which avoids the large random effects evident when fewer reviewers are used, a point where additional reviewers at increasing cost provide only diminishing incremental gains in chance-corrected consistency of decision outcomes. Random effects were most evident in the relative mid-range of competitiveness. Results support aggressive high- and low-end stratification or triaging of applications for subsequent stages of review, with the proportion and set of mid-range submissions to be retained for further consideration being dependent on overall success rate.
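A minimal sketch of the kind of bootstrap the abstract describes, under stated assumptions: the data below are synthetic, and the helper names (bootstrap_decisions, cohen_kappa, mean_kappa) and the exact decision rule (fund the top fraction of applications by mean score) are illustrative, not the paper's actual pipeline. The idea is to resample N reviewer scores per application with replacement, convert mean scores into fund/no-fund decisions at a given success rate, and measure chance-corrected agreement (Cohen's kappa) between independent bootstrap "panels".

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical replicated review data: scores[i, j] is the score given
    # to application i by replicate reviewer j (synthetic, not the paper's data).
    n_apps, n_replicates = 300, 12
    true_quality = rng.normal(0, 1, n_apps)
    scores = true_quality[:, None] + rng.normal(0, 1, (n_apps, n_replicates))

    def bootstrap_decisions(n_reviewers, success_rate):
        """Fund the top fraction of applications, ranked by the mean of
        n_reviewers scores drawn with replacement per application."""
        idx = rng.integers(0, n_replicates, (n_apps, n_reviewers))
        means = np.take_along_axis(scores, idx, axis=1).mean(axis=1)
        cutoff = np.quantile(means, 1 - success_rate)
        return means >= cutoff

    def cohen_kappa(a, b):
        """Chance-corrected agreement between two binary decision vectors."""
        po = np.mean(a == b)
        pe = a.mean() * b.mean() + (1 - a.mean()) * (1 - b.mean())
        return (po - pe) / (1 - pe)

    def mean_kappa(n_reviewers, success_rate=0.25, n_boot=200):
        """Average kappa between pairs of independent bootstrap panels."""
        return np.mean([cohen_kappa(bootstrap_decisions(n_reviewers, success_rate),
                                    bootstrap_decisions(n_reviewers, success_rate))
                        for _ in range(n_boot)])

    kappas = {n: mean_kappa(n) for n in range(1, 11)}

With synthetic data of this shape, kappa climbs steeply from one to roughly four or five reviewers and then flattens, which is the qualitative pattern the paper reports; the exact values depend on the assumed score noise and success rate.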


pone.0120838.g005 (Fig. 5): First derivative (S1) and second derivative (S2) of kappa, within incremental N to N+1 reviewers, across five overall competition success rate scenarios. Relative improvement of kappa reached stability at 4–5 reviewers per application or shortly thereafter, across all overall success rate scenarios [5% (a), 15% (b), 25% (c), 35% (d), and 50% (e)]. Vertical dashed lines represent the approximate S2 asymptotes.

Mentions: To identify a meaningful (and practical) upper limit to N reviewers, several criterion-setting stopping approaches [37] were tried (e.g., Scree plot methods used to identify a meaningful number of Principal Components). The most pragmatic (or otherwise useful) stopping approach was based on a method to select sampling duration using second derivatives [38]. The first derivative local slope (S1), calculated from point-to-point local changes in kappa, and the second derivative local change in slope (S2), similarly calculated from point-to-point changes in S1, are shown (Fig. 5). Buffin-Bélanger & Roy [38] used a natural logarithm transformation of the x-axis (time, covering many orders of magnitude), contributing to their distinct S2 inflection point. S2 of the non-transformed kappa data in this study attained an asymptote at, or slightly above, the 4–5 reviewer increment in all success rate scenarios. Substantial levels of decision consistency (kappa ≥ 0.61) were achieved with 4–5 reviewers per application, for all success rate scenarios.
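A hedged sketch of this second-derivative stopping rule (the function name, tolerance, and example kappa values are illustrative assumptions; the exact formulation in [38] may differ): difference the kappa series once to get the local slope S1, difference again to get S2, and stop at the reviewer increment after which S2 stays near zero, i.e. where added reviewers no longer change the rate of improvement.

    import numpy as np

    def stopping_point(kappa_by_n, tol=0.01):
        """Given kappa at N = 1, 2, ... reviewers, return the reviewer count
        after which the second derivative S2 remains within +/- tol of zero."""
        kappa = np.asarray(kappa_by_n, dtype=float)
        s1 = np.diff(kappa)   # first derivative: local slope of kappa
        s2 = np.diff(s1)      # second derivative: local change in slope
        for i in range(len(s2)):
            if np.all(np.abs(s2[i:]) <= tol):
                return i + 2  # s2[i] spans reviewer counts i+1 .. i+3
        return None

    # Illustrative kappa values shaped like Fig. 5 (not the paper's numbers):
    kappas = [0.35, 0.52, 0.60, 0.64, 0.66, 0.67, 0.68, 0.685]
    print(stopping_point(kappas))  # -> 5, consistent with a 4-5 reviewer optimum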

