Generate PanViz

Input formatting example (use csv)

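| Name | GO | EC | Strain 1 | Strain 2 | Strain 3 | ... |
|---|---|---|---|---|---|---|
| Gene 1 | GO:0009893; GO:0010166 | 2.4.2.10 | 1 | 1 | 1 | ... |
| Gene 2 | GO:0042810 | 2.4.2.37 | 0 | 1 | 0 | ... |
| Gene 3 | | | 2 | 1 | 3 | ... |
| Gene 4 | GO:0044249; GO:0044271 | | 1 | 0 | 1 | ... |
| Gene 5 | GO:1902591; GO:0044763; GO:0009987 | 4.1.1.28; 4.1.1.26 | 0 | 1 | 1 | ... |
| ... | ... | ... | ... | ... | ... | ... |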
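
To make the expected layout concrete, here is a minimal Python sketch that reads a file shaped like the example above. It is an illustration only, not PanViz's own parser. It assumes a comma-delimited file, assumes that multiple GO or EC terms in one cell are separated by semicolons (as the example suggests), and treats the strain columns as integer gene counts. The names `parse_pangenome` and `split_terms` are hypothetical.

```python
import csv
import io

# The rows of the example table, written out as comma-separated text.
EXAMPLE_CSV = """Name,GO,EC,Strain 1,Strain 2,Strain 3
Gene 1,GO:0009893; GO:0010166,2.4.2.10,1,1,1
Gene 2,GO:0042810,2.4.2.37,0,1,0
Gene 3,,,2,1,3
Gene 4,GO:0044249; GO:0044271,,1,0,1
Gene 5,GO:1902591; GO:0044763; GO:0009987,4.1.1.28; 4.1.1.26,0,1,1
"""


def split_terms(cell):
    """Split a semicolon-separated cell into a list of trimmed terms."""
    return [term.strip() for term in cell.split(";") if term.strip()]


def parse_pangenome(text):
    """Return (strain names, list of gene group records) from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    # Every column that is not an annotation column is assumed to be a strain.
    strains = [c for c in reader.fieldnames if c not in ("Name", "GO", "EC")]
    groups = []
    for row in reader:
        groups.append({
            "name": row["Name"],
            "go": split_terms(row["GO"]),
            "ec": split_terms(row["EC"]),
            "counts": {s: int(row[s]) for s in strains},
        })
    return strains, groups


strains, groups = parse_pangenome(EXAMPLE_CSV)
```

Empty GO or EC cells, as for Gene 3, simply become empty lists in this sketch.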

About PanViz

PanViz is a visualisation tool developed to let users intuitively navigate pangenomic data, however it was generated. It sets itself apart from other solutions such as BLAST Atlas and Mauve in that it is scalable and does not require a reference genome. Furthermore, it is built using modern web technologies and doesn't require installation of any software (save for a browser). This makes it as easy to share as a PDF file, but interactive and animated in a way that a PDF can only dream about. PanViz has been developed using the amazing D3.js framework in order to make your pangenomic data come to life. I sincerely hope you will enjoy playing with it - go make some cool science now!

Features

PanViz has been created to make it easy to explore the content and structure of a pangenome without relying on any specific analysis type or algorithm. As such it sports three different views into the pangenome data as well as a visual querying mechanism that is available across all views. The different views will be described below. Do note that one visualisation will seldom reveal all information in a data structure. While I believe PanViz is a valuable tool for exploring comparative genomics data, it should be coupled with other types of analyses and visualisations to create the full picture.

Pangenome dynamics

While most comparative genomics visualisation techniques are static in the sense that the pangenome being investigated doesn't change, PanViz has been built with the aim of visualising how gene groups move between domains as the members of the pangenome change. The member organisms of the pangenome can be updated at any time by selecting a subtree from the full pangenome, and the visualisation will update accordingly. If the circle view is active, the movement of genes between the different domains is animated and can later be queried by hovering the mouse over a GO group of interest. The currently active pangenome is always shown as a blue rectangle behind the dendrogram.
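
The idea of gene groups changing domain when the member strains change can be sketched in a few lines of Python. This is not PanViz's implementation. It assumes the common definitions that Core means present in every selected strain, Singleton means present in exactly one, and Accessory covers everything in between; the "Absent" label and the function name `classify_domains` are hypothetical additions for illustration. The counts are taken from the input formatting example.

```python
def classify_domains(counts_by_group, selected_strains):
    """Assign each gene group to a domain for the selected strains."""
    domains = {}
    n_selected = len(selected_strains)
    for name, counts in counts_by_group.items():
        # Number of selected strains carrying at least one copy.
        n_present = sum(1 for s in selected_strains if counts.get(s, 0) > 0)
        if n_present == 0:
            domains[name] = "Absent"      # hypothetical label, not a PanViz domain
        elif n_present == n_selected:
            domains[name] = "Core"
        elif n_present == 1:
            domains[name] = "Singleton"
        else:
            domains[name] = "Accessory"
    return domains


COUNTS = {
    "Gene 1": {"Strain 1": 1, "Strain 2": 1, "Strain 3": 1},
    "Gene 2": {"Strain 1": 0, "Strain 2": 1, "Strain 3": 0},
    "Gene 3": {"Strain 1": 2, "Strain 2": 1, "Strain 3": 3},
    "Gene 4": {"Strain 1": 1, "Strain 2": 0, "Strain 3": 1},
    "Gene 5": {"Strain 1": 0, "Strain 2": 1, "Strain 3": 1},
}

# Domains for the full pangenome and for a two-strain subset.
full = classify_domains(COUNTS, ["Strain 1", "Strain 2", "Strain 3"])
subset = classify_domains(COUNTS, ["Strain 1", "Strain 3"])

# Gene groups whose domain changed when the membership changed.
moved = {g: (full[g], subset[g]) for g in COUNTS if full[g] != subset[g]}
```

Here `moved` holds the gene groups that would be seen travelling between domains when Strain 2 is dropped from the pangenome.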

Strain-pangenome comparison

Apart from looking at the dynamics of a pangenome, the currently selected pangenome can at any time be compared to one or two strains by selecting these, either in the dendrogram or in the scatterplot. The comparison is shown by drawing links from the top-level GO terms of the strain to their respective locations in the pangenome. The thickness of a link indicates the number of gene groups, and its colour the domain it links to. If two strains are selected, each link has two shades. The darkest part indicates the gene groups the two strains have in common, while the light part indicates gene groups unique to that strain. Strains do not need to be a part of the pangenome they are compared to. If they are not, the genes in each GO term that are not part of the pangenome are indicated by a lack of link.
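
The two shades of a link correspond to a simple per-GO tally of shared and strain-specific gene groups. The following Python sketch shows one way such a tally could be computed. It is an assumption-laden illustration rather than PanViz's code: it groups by the GO terms annotated directly on each gene group instead of resolving top-level GO terms, and the function name `compare_strains` and the keys `shared`, `only_a` and `only_b` are hypothetical.

```python
def compare_strains(groups, strain_a, strain_b):
    """Count shared and strain-specific gene groups per GO term."""
    summary = {}
    for group in groups:
        in_a = group["counts"].get(strain_a, 0) > 0
        in_b = group["counts"].get(strain_b, 0) > 0
        if not (in_a or in_b):
            continue  # present in neither strain, so it contributes to no link
        if in_a and in_b:
            key = "shared"   # the dark part of both links
        elif in_a:
            key = "only_a"   # the light part of strain A's link
        else:
            key = "only_b"   # the light part of strain B's link
        for term in group["go"]:
            entry = summary.setdefault(
                term, {"shared": 0, "only_a": 0, "only_b": 0}
            )
            entry[key] += 1
    return summary


# Annotated gene groups from the input formatting example.
GROUPS = [
    {"name": "Gene 1", "go": ["GO:0009893", "GO:0010166"],
     "counts": {"Strain 1": 1, "Strain 2": 1, "Strain 3": 1}},
    {"name": "Gene 2", "go": ["GO:0042810"],
     "counts": {"Strain 1": 0, "Strain 2": 1, "Strain 3": 0}},
    {"name": "Gene 4", "go": ["GO:0044249", "GO:0044271"],
     "counts": {"Strain 1": 1, "Strain 2": 0, "Strain 3": 1}},
]

result = compare_strains(GROUPS, "Strain 1", "Strain 2")
```

In this sketch the thickness of strain A's link for a term would be `shared + only_a`, and that of strain B's link `shared + only_b`.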

Gene ontology navigation

While most views in the visualisation only utilise the top-level gene ontology annotation, the full GO graph is available for navigation by clicking on one of the GO wedges in the circle view or the GO bars in the strain comparison view. This lets you dive into the gene groups contained in this ontology by zooming in on a treemap where each node/rectangle is scaled to the number of gene groups contained within. By updating the pangenome while navigating the gene ontology, it is possible to see how the distributions of gene ontology terms change. Likewise, differences between distributions within the Core, Accessory and Singleton domains can be seen in this way.

Visual querying

While dynamic visualisations are nice, they often make you wonder. After all, this is what they are designed for. Because of this, it should also be possible to extract the underlying information for each visual element in order to test hypotheses or build new ones. As all visual elements in PanViz (except for scales and legends) represent a set of gene groups, it has been made easy for the user to select these and perform set operations on them (union, intersection, complement etc.). The list at the bottom of the visualisation contains the current set (the full list of gene groups at the onset), and gets updated as elements are queried. Using these tools, it is possible to make very powerful selections in a visually intuitive way.
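
The set operations named above map directly onto ordinary set arithmetic. As a small, hedged illustration in Python (the element names `core_element` and `strain_2_element` are hypothetical stand-ins for visual elements, and the memberships come from the input formatting example):

```python
# The current set starts out as the full list of gene groups.
ALL_GROUPS = {"Gene 1", "Gene 2", "Gene 3", "Gene 4", "Gene 5"}

# Hypothetical visual elements, each representing a set of gene groups.
core_element = {"Gene 1", "Gene 3"}                          # found in all three strains
strain_2_element = {"Gene 1", "Gene 2", "Gene 3", "Gene 5"}  # found in Strain 2

# Union: gene groups in either element.
union = core_element | strain_2_element

# Intersection: gene groups in both elements.
intersection = core_element & strain_2_element

# Complement: gene groups in the full set that are not in the element.
complement = ALL_GROUPS - core_element

# Operations can be chained, e.g. Strain 2 gene groups that are not core.
strain_2_non_core = strain_2_element - core_element
```

Chaining a few such operations is enough to express queries like "everything in Strain 2 that is not core", which is the kind of selection the visual querying mechanism is described as supporting.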

Easy sharing

PanViz is built with modern web technologies and only requires a fairly recent web browser to work. The provided file is fully self-contained and can be opened without any internet connection, but is also fit to be hosted on a server and shared as a webpage. Because no additional installation is needed, it is as easy to share as the more common static visualisations usually found in scientific literature.

References

Pedersen, Thomas Lin (2015). PanViz. Scalable and unbiased pangenome visualisation. In preparation