Inter-rater agreement in the scoring of abstracts submitted to a primary care research conference.

Montgomery AA, Graham A, Evans PH, Fahey T - BMC Health Serv Res (2002)


Affiliation: Division of Primary Health Care, University of Bristol, Cotham House, Cotham Hill, Bristol BS8 2PR, UK. alan.a.montgomery@bristol.ac.uk

ABSTRACT

Background: Checklists for peer review aim to guide referees when assessing the quality of papers, but little evidence exists on the extent to which referees agree when evaluating the same paper. The aim of this study was to investigate agreement between two referees on the dimensions of a checklist used to evaluate abstracts submitted to a primary care conference.

Methods: Anonymised abstracts were scored using a structured assessment comprising seven categories. Between one (poor) and four (excellent) marks were awarded for each category, giving a maximum possible score of 28 marks. Every abstract was assessed independently by two referees, and agreement was measured using intraclass correlation coefficients. Mean total scores of abstracts accepted and rejected for the meeting were compared using an unpaired t test.
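
As a concrete illustration of these methods, the sketch below shows one way to compute a two-way random-effects, single-measure ICC for two raters, together with an unpaired t test, in Python. This is a minimal sketch, not the authors' code: the paper does not specify which "adjusted" ICC variant was used, and all numbers below are synthetic.

import numpy as np
from scipy import stats

def icc_2_1(x):
    # ICC(2,1): two-way random effects, absolute agreement, single measure
    # (Shrout & Fleiss), for an (n_targets, k_raters) matrix of scores.
    x = np.asarray(x, dtype=float)
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # abstracts
    ms_cols = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # referees
    resid = x - x.mean(axis=1, keepdims=True) - x.mean(axis=0) + grand
    ms_err = (resid ** 2).sum() / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )

rng = np.random.default_rng(0)
# Synthetic marks for one checklist category: 52 abstracts, 2 referees,
# each mark clipped to the 1-4 range used in the study.
latent = rng.normal(2.5, 0.6, size=52)
scores = np.clip(latent[:, None] + rng.normal(0, 0.5, size=(52, 2)), 1, 4)
print(f"ICC(2,1) = {icc_2_1(scores):.2f}")

# Unpaired t test on hypothetical total scores; the 30/22
# accepted/rejected split is invented for illustration.
accepted = rng.normal(17.4, 3.0, size=30)
rejected = rng.normal(14.6, 3.0, size=22)
t, p = stats.ttest_ind(accepted, rejected)
print(f"t = {t:.2f}, p = {p:.4f}")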

Results: Across the 52 abstracts, agreement between reviewers was greater for the three components relating to study design (adjusted intraclass correlation coefficients 0.40 to 0.45) than for the four components relating to more subjective elements, such as the importance of the study and its likelihood of provoking discussion (0.01 to 0.25). The mean score for accepted abstracts was significantly higher than that for rejected abstracts (17.4 versus 14.6; 95% CI for the difference 1.3 to 4.1; p = 0.0003).
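
For reference, the reported interval has the standard unpaired t form, shown here assuming the common pooled-variance version (the paper does not state which variant was used):

\[
\bar{x}_{\mathrm{acc}} - \bar{x}_{\mathrm{rej}} = 17.4 - 14.6 = 2.8,
\qquad
2.8 \pm t_{0.975,\, n_{\mathrm{acc}} + n_{\mathrm{rej}} - 2}\;
s_p \sqrt{\tfrac{1}{n_{\mathrm{acc}}} + \tfrac{1}{n_{\mathrm{rej}}}}
\approx (1.3,\ 4.1),
\]

so the half-width of about 1.4 implies a standard error of roughly 0.7 for the difference in means.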

Conclusions: The findings suggest that including subjective components in a review checklist may result in greater disagreement between reviewers. However, in terms of overall quality scores, abstracts accepted for the meeting were rated significantly higher than those that were rejected.


Figure 1: Difference between referees' scores versus mean score

