Research Article: The emergence and evolution of the research fronts in HIV/AIDS research

Date Published: May 25, 2017

Publisher: Public Library of Science

Author(s): David Fajardo-Ortiz, Malaquias Lopez-Cervantes, Luis Duran, Michel Dumontier, Miguel Lara, Hector Ochoa, Victor M. Castano, Dimitrios Paraskevis.


In this paper, we have identified and analyzed the emergence, structure and dynamics of the paradigmatic research fronts that established the fundamentals of the biomedical knowledge on HIV/AIDS. A search for papers with the identifiers “HIV/AIDS”, “Human Immunodeficiency Virus”, “HIV-1” and “Acquired Immunodeficiency Syndrome” in the Web of Science (Thomson Reuters) was carried out. A citation network of those papers was constructed. Then, a sub-network of the papers with the highest number of inter-citations (with a minimal in-degree of 28) was selected to perform a combination of network clustering and text mining to identify the paradigmatic research fronts and analyze their dynamics. Thirteen research fronts were identified in this sub-network. The biggest and oldest front is related to the clinical knowledge on the disease in the patient. Nine of the fronts are related to the study of specific molecular structures and mechanisms, and two of these fronts are related to the development of drugs. The rest of the fronts are related to the study of the disease at the cellular level. Interestingly, the emergence of these fronts occurred in successive “waves” over time, which suggests a transition in the paradigmatic focus. The emergence and evolution of the biomedical fronts in HIV/AIDS research are explained not just by the partition of the problem into elements and interactions leading to increasingly specialized communities, but also by changes in the technological context of this health problem and the dramatic changes in the epidemiological reality of HIV/AIDS that occurred between 1993 and 1995.

Partial Text

Human Immunodeficiency Virus/Acquired Immunodeficiency Syndrome (HIV/AIDS) is a global health problem: over 70 million people have been infected with HIV, 35 million have died and 36.7 million people currently live with the disease [1]. HIV/AIDS is one of the most studied infectious diseases, with more than 260,000 papers (mentioning the topic) listed in GOPubMed [2] and more than 42,000 papers (mentioning HIV/AIDS in the title) in the Web of Science [3], spanning over thirty years of scientific research. HIV/AIDS is studied by a plurality of biomedical disciplines such as epidemiology [4], virology [5], immunology [6] and drug development [7], and by non-biomedical disciplines such as the social sciences [8] and the humanities [9]. All the biomedical disciplines working on HIV/AIDS rely strongly on a solid scientific consensus, which explains the clinical manifestation of HIV/AIDS in terms of the interactions of the virus with the immune system cells, the behavior and demography of the immune system cells and, most importantly, the interaction of the virus with the biomolecular machinery of the host cells [10–12]. Two features are believed to be at the core of the scientific consensus on HIV/AIDS: the natural history of the HIV infection (the number of CD4+ cells and HIV RNA copies plotted over time) [11] and the virus replication cycle (from virus entry to virus assembly, budding and maturation) [10, 12].

According to the Cytoscape analysis, the network of HIV/AIDS papers displays a power law distribution of citations, which has important methodological implications. The first implication is that a research front (a citation network module) could be formed by other research fronts, which in turn can be partitioned into sub-modules [44]. The second implication is that the nodes (papers) with the highest indegree tend to be more “cosmopolitan”, i.e., they have the lowest clustering coefficient values [44]. That is, they could belong simultaneously to several fronts or any of them. Therefore, there are no clearly defined frontiers dividing the research fronts. However, the standard in the scientometric study of research fronts seems to be to use clustering methods that define frontiers between the modules [14, 17, 18, 24, 25, 27, 28], probably because this makes the community structure of a literature network much more understandable. Moreover, our analysis reveals the front that is most relevant to each paper. Finally, the most important implication is that, in a hierarchical literature network, the papers with the highest indegree are related to the paradigms that organize a research field or topic [14]. Top cited papers have been extensively used to identify the scientific achievements that establish the standards of research practice of a particular community [14, 17, 22, 23, 27, 28, 45, 46]. There are no set guidelines on the proportion of top cited papers that should be selected. However, there is a trade-off between selecting the most informative papers and maintaining the diversity of the information [14, 17]. In this work, we used a minimal indegree of 30 to select the top cited papers, thereby capturing a considerable percentage of the citations. The selected papers make up only ten percent of the network, but they account for two thirds of the citations.
The selected papers are a reasonable representation of the paradigmatic core of HIV/AIDS research.
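The selection step described above can be illustrated with a minimal sketch. It assumes the citation network is available as a list of (citing, cited) pairs; the function name `select_top_cited`, the toy data and the threshold of 3 are hypothetical and serve only to show how the share of papers and the share of citations retained by an indegree cutoff could be computed. It is not the authors' Cytoscape workflow.

```python
from collections import Counter


def select_top_cited(citations, min_indegree):
    """Select papers whose indegree is at least `min_indegree`.

    `citations` is an iterable of (citing, cited) pairs (an assumed input
    format). Returns the selected papers, their share of all papers in the
    network, and the share of all citations that they receive.
    """
    edges = set(citations)  # ignore duplicated citation records
    if not edges:
        return set(), 0.0, 0.0

    # Indegree = number of distinct papers in the network citing a paper.
    indegree = Counter(cited for _, cited in edges)
    papers = {paper for edge in edges for paper in edge}

    selected = {p for p in papers if indegree[p] >= min_indegree}
    received = sum(indegree[p] for p in selected)

    return selected, len(selected) / len(papers), received / len(edges)


# Toy network (hypothetical): paper "A" is cited four times, "B" once.
toy_citations = [("B", "A"), ("C", "A"), ("D", "A"), ("E", "A"), ("C", "B")]
selected, paper_share, citation_share = select_top_cited(toy_citations, 3)
print(sorted(selected), paper_share, citation_share)
```

In a network with a power law distribution of citations, raising the threshold shrinks the share of papers much faster than the share of citations, which is the trade-off between informativeness and diversity noted above.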