Research Article: New framework for automated article selection applied to a literature review of Enhanced Biological Phosphorus Removal

Date Published: May 9, 2019

Publisher: Public Library of Science

Author(s): Minh Nguyen Quang, Tim Rogers, Jan Hofman, Ana B. Lanham, Nicolas Roche.


Enhanced Biological Phosphorus Removal (EBPR) is a technology widely used in wastewater treatment to remove phosphorus (P) and prevent eutrophication. Establishing its operating efficiency and stability is an active research field that has generated almost 3000 publications in the last 40 years. Given the size of this literature, which includes over 119 review articles, EBPR is an example of a field in which it has become increasingly difficult to identify the key research contributions manually, especially for non-experts or newcomers. This work therefore had two distinct but complementary objectives. The first was to assemble, for the first time, a collection of bibliometric techniques into a framework for automating the article selection process when preparing a literature review (section 2). The second was to demonstrate the framework by applying it to the field of EBPR, producing a bibliometric analysis and a review of the key findings of EBPR research over time (section 3).

The joint analysis of citation networks, keywords and citation profiles, together with specific benchmarks for identifying highly-cited publications, revealed 12 research topics. Their content and evolution could be manually reviewed using a selection of articles amounting to only about 5% of the original set of publications. The largest topics addressed the identification of relevant microorganisms, the characterization of their metabolism, including denitrification and the competition between them (Clusters A-D). Emerging and influential topics, as determined by different citation indicators and temporal analysis, were related to volatile fatty acid production, P-recovery from waste activated sludge and aerobic granules for better process efficiency and stability (Clusters F-H).

The framework highlighted key contributions in each of the constituent topics, a selection that conventional citation-based ranking might otherwise have biased. Further, it reduced the need for manual input and a priori expertise compared to a traditional literature review. Hence, in an era of accelerated production of information and publications, this work contributed to the use of computer-aided approaches to curate information and manage knowledge.

Partial Text

Eutrophication—the over-abundance of certain nutrients in water bodies, unbalancing local ecosystems—is a major environmental concern. To prevent this, numerous technologies have been developed to remove phosphorus from wastewater [1]. Of these technologies, Enhanced Biological Phosphorus Removal (EBPR) is perhaps one of the most popular choices in wastewater treatment plants (WWTP), especially for those with larger capacities. It is essentially a variation of conventional Activated Sludge (AS). By engineering alternating anaerobic, aerobic and often anoxic conditions, the resulting community of microorganisms removes phosphorus (P) by intra-cellular accumulation. Primarily, EBPR offers an alternative to chemical precipitation for P-removal. However, it also became an extremely interesting model of microbial ecology applied to complex engineered environmental systems.

The bibliometric framework described in section 2 was applied to the EBPR corpus in order to provide a bibliometric and systematic review of the field. The aims of this approach, after 40 years of research, were to: 1) identify the key topics within the field; 2) infer their importance and development from their size and inter-relationships; 3) analyze their temporal evolution as determined by citation-based indicators; and 4) review each topic’s key research developments as determined by an automated citation-based selection.

This work proposed a framework that facilitates the selection of articles for a literature review by combining bibliometric techniques with other indicators, and applied it to EBPR. In section 2, two clustering techniques, bibliographic coupling and co-citation, were compared, with the former yielding the better results. The framework also included the use of keyword analysis, citation profiles, statistical analysis of dates of publication and benchmarks of citation counts as indicators to measure impact in the form of popularity, temporal trends and patterns in the flow of information. The fact that all this information can be retrieved statistically, rather than relying solely on expert judgment, is the main advantage of the framework, both for newcomers to a research field and for experienced researchers seeking a more systematic perspective. As a result, in section 3, twelve clusters were obtained, each equivalent to a topic under the umbrella of EBPR, and a literature review was produced from less than 5% of the original EBPR corpus, making the approach less time-consuming.
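
To illustrate the difference between the two similarity measures named above, the following minimal Python sketch computes both on a toy citation list. It is not the authors' implementation: the paper identifiers, the `references` mapping and the two function names are hypothetical, and the sketch stops at pairwise counts rather than performing any clustering. Bibliographic coupling links two citing papers by the number of references they share, whereas co-citation links two cited references by the number of papers that cite both.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy data: each citing paper maps to the set of references it cites.
references = {
    "P1": {"R1", "R2", "R3"},
    "P2": {"R2", "R3", "R4"},
    "P3": {"R5"},
}


def bibliographic_coupling(refs):
    """Coupling strength of two papers = number of references they share."""
    strength = {}
    for a, b in combinations(sorted(refs), 2):
        shared = len(refs[a] & refs[b])
        if shared:  # keep only pairs with at least one shared reference
            strength[(a, b)] = shared
    return strength


def co_citation(refs):
    """Co-citation count of two references = number of papers citing both."""
    counts = Counter()
    for cited in refs.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return dict(counts)


coupling = bibliographic_coupling(references)
cocited = co_citation(references)
print(coupling)
print(cocited)
```

In this toy example only P1 and P2 are coupled, because they share R2 and R3, and the same two references form the most frequently co-cited pair. In either case the resulting weighted pairs could serve as edges of a similarity network on which a clustering method is then run.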



