Research Article: Empirical Comparison of Visualization Tools for Larger-Scale Network Analysis

Date Published: July 18, 2017

Publisher: Hindawi

Author(s): Georgios A. Pavlopoulos, David Paez-Espino, Nikos C. Kyrpides, Ioannis Iliopoulos.


Gene expression, signal transduction, protein/chemical interactions, biomedical literature co-occurrences, and other concepts are often captured in biological network representations in which nodes represent bioentities and edges represent the connections between them. While many tools to manipulate, visualize, and interactively explore such networks already exist, only a few of them can scale up and keep pace with today’s rapid information growth. In this review, we briefly catalog the available network visualization tools and, from a user-experience point of view, identify four candidate tools suitable for larger-scale network analysis, visualization, and exploration. We comment on their strengths and weaknesses and empirically discuss their scalability, user friendliness, and postvisualization capabilities.

Partial Text

Health and natural sciences have become protagonists in the big-data world as high-throughput advances continuously contribute to the exponential growth of data volumes. Biological repositories now expand every day, hosting various entities such as proteins, genes, drugs, chemicals, ontologies, functions, and articles, together with the interactions between them, which often leads to large-scale networks of thousands or even millions of nodes and connections. As such networks are characterized by different properties and topologies, graph theory plays a very important role by providing ways to efficiently store, analyze, and subsequently visualize them [1–5].
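As a minimal sketch of what storing and analyzing such a network can look like, the following Python snippet keeps an undirected network as an adjacency list and computes node degrees. The entity names and interactions are hypothetical placeholders, not data from the article, and the adjacency list is only one of several possible representations.

```python
from collections import defaultdict

# Hypothetical toy interactions (illustrative names, not data from the article).
edges = [
    ("geneA", "proteinB"),
    ("proteinB", "drugC"),
    ("geneA", "drugC"),
    ("drugC", "articleD"),
]


def build_adjacency(edge_list):
    """Store an undirected network as an adjacency list (node -> set of neighbors)."""
    adjacency = defaultdict(set)
    for source, target in edge_list:
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def degrees(adjacency):
    """Return the number of neighbors of each node."""
    return {node: len(neighbors) for node, neighbors in adjacency.items()}


adjacency = build_adjacency(edges)
node_degrees = degrees(adjacency)
```

An adjacency list uses memory proportional to the number of nodes plus edges, which is one reason it tends to suit sparse networks with many nodes.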

Despite the plethora of available network visualization tools, the continuous increase of data volume in the health sciences means that visualization and manipulation of large-scale networks with millions of nodes and edges remain a bottleneck. Noninteractive libraries such as the Stanford Network Analysis Project (SNAP) [47], the outdated Large Graph Layout (LGL) [48], NetworkX [49], and GraphViz [50] are preferred for backend calculations and large-scale static visualizations, and alternative network visualizations such as those offered by Circos [51], HivePlots [52], and BioFabric [53] can partially solve the hairball effect. Nevertheless, implementing user-friendly interactive tools to handle and visualize such large graphs remains a very complicated task. Therefore, for the purposes of this review article, we tested several available standalone applications and concluded that Pajek, Tulip, Gephi, and Cytoscape are top candidates for large-scale network visualization and analysis.

It is neither fair nor straightforward to directly compare visualization tools with each other, as they are implemented to serve different purposes. Nevertheless, as biological network sizes increase over time, combining the complementary advantages of different tools is a good strategy. While several file formats for describing network structure have been standardized, our experience showed that many of them cannot be properly exported or imported across tools. In addition, even in the best cases where such an import/export problem is absent, node and edge attributes often cannot be transferred. Therefore, we believe that a universal network converter, able to accurately convert any file format into any other while retaining the maximum information about the network’s components, is necessary. This way, switching between tools and visualizations will become easier and more straightforward.
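To illustrate the attribute-transfer problem, here is a minimal sketch, using only the Python standard library, that writes a small network to GraphML (which can carry node and edge attributes) and to a bare tab-separated edge list (which cannot). The helper names `write_graphml`, `read_graphml`, and `write_edge_list`, the sample entities, and the choice to store every attribute as a string are assumptions made for illustration; this is not the converter the authors call for.

```python
import xml.etree.ElementTree as ET

NS = "http://graphml.graphdrawing.org/xmlns"
ET.register_namespace("", NS)  # serialize GraphML as the default namespace


def q(tag):
    """Qualify a tag name with the GraphML namespace."""
    return f"{{{NS}}}{tag}"


def write_graphml(nodes, edges):
    """Serialize {node_id: attrs} and [(source, target, attrs)] to a GraphML string.

    Assumption: all attribute values are stored as strings.
    """
    root = ET.Element(q("graphml"))
    # GraphML declares every attribute up front with a <key> element.
    node_names = sorted({name for attrs in nodes.values() for name in attrs})
    edge_names = sorted({name for _, _, attrs in edges for name in attrs})
    for scope, names in (("node", node_names), ("edge", edge_names)):
        for name in names:
            ET.SubElement(root, q("key"), {
                "id": f"{scope}_{name}", "for": scope,
                "attr.name": name, "attr.type": "string",
            })
    graph = ET.SubElement(root, q("graph"), {"id": "G", "edgedefault": "undirected"})
    for node_id, attrs in nodes.items():
        element = ET.SubElement(graph, q("node"), {"id": node_id})
        for name, value in attrs.items():
            ET.SubElement(element, q("data"), {"key": f"node_{name}"}).text = str(value)
    for source, target, attrs in edges:
        element = ET.SubElement(graph, q("edge"), {"source": source, "target": target})
        for name, value in attrs.items():
            ET.SubElement(element, q("data"), {"key": f"edge_{name}"}).text = str(value)
    return ET.tostring(root, encoding="unicode")


def read_graphml(text):
    """Parse GraphML produced by write_graphml back into (nodes, edges)."""
    root = ET.fromstring(text)
    # Map key ids back to the attribute names they declare.
    names = {key.get("id"): key.get("attr.name") for key in root.iter(q("key"))}
    nodes, edges = {}, []
    for element in root.iter(q("node")):
        nodes[element.get("id")] = {
            names[d.get("key")]: d.text or "" for d in element.findall(q("data"))
        }
    for element in root.iter(q("edge")):
        attrs = {names[d.get("key")]: d.text or "" for d in element.findall(q("data"))}
        edges.append((element.get("source"), element.get("target"), attrs))
    return nodes, edges


def write_edge_list(edges):
    """Serialize only the topology as tab-separated pairs; attributes are dropped."""
    return "\n".join(f"{source}\t{target}" for source, target, _ in edges)


# Hypothetical sample network (illustrative names and values only).
nodes = {"geneA": {"type": "gene"}, "proteinB": {"type": "protein"}}
edges = [("geneA", "proteinB", {"interaction": "binds", "weight": "0.9"})]

graphml_text = write_graphml(nodes, edges)
restored_nodes, restored_edges = read_graphml(graphml_text)
edge_list_text = write_edge_list(edges)
```

In this sketch the GraphML round trip returns the same nodes, edges, and attributes that went in, whereas the edge-list output keeps only the two endpoint names per line, so any tool importing it would see the topology without the `type`, `interaction`, or `weight` values.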



