Research Article: Author Self-Citation in the General Medicine Literature

Date Published: June 16, 2011

Publisher: Public Library of Science

Author(s): Abhaya V. Kulkarni, Brittany Aziz, Iffat Shams, Jason W. Busse, Margaret Sampson.

Abstract: Author self-citation contributes to the overall citation count of an article and the impact factor of the journal in which it appears. Little is known, however, about the extent of self-citation in the general clinical medicine literature. The objective of this study was to determine the extent and temporal pattern of author self-citation and the article characteristics associated with author self-citation.

We performed a retrospective cohort study of articles published in three high impact general medical journals (JAMA, Lancet, and New England Journal of Medicine) between October 1, 1999 and March 31, 2000. We retrieved the number and percentage of author self-citations received by the article since publication, as of June 2008, from the Scopus citation database. Several article characteristics were extracted by two blinded, independent reviewers for each article in the cohort and analyzed in multivariable linear regression analyses. Since publication, author self-citations accounted for 6.5% (95% confidence interval 6.3–6.7%) of all citations received by the 328 articles in our sample. Self-citation peaked in 2002, declining annually thereafter. Studies with more authors, in cardiovascular medicine or infectious disease, and with smaller sample sizes were associated with more author self-citations and a higher percentage of author self-citation (all p≤0.01).

Approximately 1 in 15 citations of articles in high-profile general medicine journals are author self-citations. Self-citation peaks within about 2 years of publication and disproportionately affects impact factor. Studies most vulnerable to this effect are those with more authors, those with small sample sizes, and those in cardiovascular medicine or infectious disease.

Partial Text: Citation counts received by journal articles are used to inform decisions of academic promotion and in the assessment of research and journal impact. Author self-citation, as opposed to journal self-citation, occurs when authors reference their own publications and this practice can be regarded positively (e.g., guiding readers to important relevant research) or negatively (e.g., intentionally inflating the impact of one’s own work). Regardless of the motivation that underlies self-citation [1], there is only a limited body of literature that has addressed its quantitative impact in the general clinical medicine literature. Some studies [2], [3] have examined synchronous self-citation [4], in which the references of an index article are reviewed for previous works by the same author(s). By examining only the bibliographies of index articles, however, synchronous self-citation does not tell us about the effect of self-citation on the citation counts of the index articles, which is how the impact of articles and journals is commonly assessed. Therefore, to understand the impact of self-citation on citation counts, information is needed about diachronous self-citation [4], in which a citation database is used to establish when an index article is cited by future publications from the same author(s). Diachronous self-citation contributes directly to the overall citation count and could alter perceptions about the impact of an article or journal in which it appears. Glanzel et al., in a “macro level” analysis [5], showed that diachronous self-citations occur earlier after publication than non-self-citations, but they did not explore associations between individual article characteristics and self-citation. Other studies of diachronous self-citation are limited by narrow clinical focus [6], narrow geographical focus [7], or limited post-publication windows [6], [8].
We used Scopus (Elsevier), a relatively new citation database that includes a feature to isolate author self-citations from total citation counts, to examine diachronous self-citation in a large cohort of articles published in high-profile general medicine journals more than 8 years ago. We were specifically interested in identifying the relative contribution of author self-citation to the overall citation count of articles and in determining whether specific article characteristics were associated with author self-citation.

Through hand-searching, we acquired a sample of original research papers published in JAMA, Lancet, and New England Journal of Medicine (NEJM) between October 1, 1999 and March 30, 2000 [9], [10]. This period of publication is well within the coverage range of Scopus [11]. We included all articles published under the following table of content headings: “Original Contributions” in JAMA, “Original Research–Articles” in Lancet, and “Original Articles” in NEJM.

Characteristics of the 328 index articles are shown in Table 1. The median sample size for these articles was 642 (interquartile range = 147 to 6363) and the median number of authors was 5 (interquartile range = 4 to 9). In a random sample of 20 of the 328 articles, the sensitivity of Scopus in accurately identifying author self-citations was 95.5% and the specificity was 100% (Scopus identified 298 self-citations, of which none were false-positive, and 3601 non-self-citations, of which 14 were false-negative).
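
The validation figures above can be checked with a short calculation. The following is a minimal sketch, assuming that the 14 false-negatives are true author self-citations that Scopus placed among its 3601 non-self-citations; the function and variable names are illustrative and do not come from the study.

```python
def sensitivity_specificity(tp, fp, fn, tn):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # share of true self-citations that were flagged
    specificity = tn / (tn + fp)  # share of true non-self-citations left unflagged
    return sensitivity, specificity


# Counts reported for the random sample of 20 articles
flagged_self = 298        # citations Scopus labeled as author self-citations
false_positives = 0       # of those, none were actually non-self-citations
flagged_non_self = 3601   # citations Scopus labeled as non-self-citations
false_negatives = 14      # of those, 14 were actually self-citations (assumed reading)

tp = flagged_self - false_positives
tn = flagged_non_self - false_negatives

sens, spec = sensitivity_specificity(tp, false_positives, false_negatives, tn)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
```

Under this reading, sensitivity is 298/(298 + 14) and specificity is 3587/(3587 + 0), which agrees with the reported 95.5% and 100%.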

In a cohort of articles published in high-profile general medicine journals, we found that, over an 8-year follow-up period, approximately 6.5% (95% CI 6.3–6.7) of citations received were author self-citations and 1.1% (95% CI 1.0–1.1) were journal self-citations. Author self-citations peaked about two years after publication and then declined progressively thereafter (Figure 1). Studies with more authors, those in cardiovascular medicine or infectious disease, and those with a smaller sample size were associated with more author self-citations and a higher percentage of author self-citation per article. Publication in JAMA and more non-self-citations were also associated with more total author self-citations per article.