Aim

To build a sunburst chart that represents the microbial composition of a microbiome sample. Our input will be a vector of NCBI taxonomy IDs (taxids).

A taxid’s abundance is its frequency in the input vector.
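
To make that definition concrete, here is a small base R sketch that tallies how often each taxid occurs. The vector `example_taxids` is made up for illustration, and this is not necessarily how the package counts taxids internally.

```r
# Hypothetical input: each element is one observation of a taxid
example_taxids <- c(561, 561, 1639, 1639, 1639, 529731)

# Abundance of a taxid = number of times it appears in the vector
abundance <- table(example_taxids)

# Relative abundance as a proportion of all observations
relative_abundance <- prop.table(abundance)
```

Here taxid 1639 appears three times out of six observations, so it accounts for half of the sample.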

Libraries

We’ll need two R packages: sunburst and taxizedbextra.

# remotes::install_github("selkamand/sunburst")
library(sunburst) 

# remotes::install_github("selkamand/taxizedbextra")
library(taxizedbextra) 

Download NCBI taxonomy database

First, let’s use `db_download_ncbi()` from taxizedb (exposed by taxizedbextra) to download the NCBI taxonomy database locally. A local copy lets us build sunburst plots from taxonomy IDs at blistering speed. It’s just over 2 GB, so the download might take a while, but it’ll be worth it down the road.

# Download ncbi taxonomy database. 
db_download_ncbi(overwrite = TRUE)

On my MacBook it saves the database to ~/Library/Caches/R/taxizedb; the location depends on your operating system. You can check where it’s downloaded for you by running `locate_taxonomy_cache()`:

# Where is my taxonomy database downloaded to?
locate_taxonomy_cache()
#> <hoard> 
#>   path: taxizedb
#>   cache path: ~/.cache/R/taxizedb

Get data for sunburst plot

We need to get data in the required format: a numeric vector of NCBI taxids. You can use taxid = -1 for ‘unclassified sequences’.

# Here we're simulating our data
taxids = c(rep(561, times = 10), rep(1639, times = 20), rep(529731, times = 10))

taxids
#>  [1]    561    561    561    561    561    561    561    561    561    561
#> [11]   1639   1639   1639   1639   1639   1639   1639   1639   1639   1639
#> [21]   1639   1639   1639   1639   1639   1639   1639   1639   1639   1639
#> [31] 529731 529731 529731 529731 529731 529731 529731 529731 529731 529731
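
If your data arrive as per-taxon counts rather than one entry per sequence, you can expand them into the same format with `rep()`. The following is a minimal sketch; the data frame `counts` and its columns `taxid` and `n` are hypothetical names chosen for illustration, not part of the package.

```r
# Hypothetical per-taxon counts, e.g. summarised from a classifier report
counts <- data.frame(
  taxid = c(561, 1639, -1),  # -1 marks unclassified sequences
  n     = c(10, 20, 5)
)

# Repeat each taxid by its count so the vector has one element per observation
taxids_from_counts <- rep(counts$taxid, times = counts$n)

# Total number of observations equals the sum of the counts
length(taxids_from_counts)
```

The resulting vector has the same shape as the simulated `taxids` vector, with the unclassified sequences encoded as -1.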

Create sunburst plot

# generate sunburst plot
microbial_sunburst(
  taxids = taxids, 
  ranks_to_include = c("species", "genus", "family")
  )
#> [ℹ] Getting taxid lineages
#> [ℹ] Constructing sunburst plot
#> Registered S3 method overwritten by 'httr':
#>   method           from  
#>   print.cache_info hoardr

Run `?microbial_sunburst` to learn how to customise this plot.

Acknowledgements

The engines driving the functionality of this package are the taxizedb and plotly packages. A big thanks to all involved in the creation and maintenance of these packages.