ggoncoplot

Usage

ggoncoplot(
  data,
  col_genes,
  col_samples,
  col_mutation_type = NULL,
  genes_to_include = NULL,
  genes_to_ignore = NULL,
  col_tooltip = col_samples,
  topn = 10,
  return_extra_genes_if_tied = FALSE,
  draw_gene_barplot = FALSE,
  draw_tmb_barplot = FALSE,
  copy = c("sample", "gene", "tooltip", "mutation_type", "nothing"),
  palette = NULL,
  metadata = NULL,
  metadata_palette = NULL,
  col_samples_metadata = col_samples,
  cols_to_plot_metadata = NULL,
  metadata_require_mutations = TRUE,
  pathway = NULL,
  col_genes_pathway = col_genes,
  show_all_samples = FALSE,
  total_samples = c("any_mutations", "all", "oncoplot"),
  interactive = TRUE,
  options = ggoncoplot_options(),
  verbose = TRUE,
  ...
)

Arguments

data

data for oncoplot. A data.frame with 1 row per mutation in your cohort. Must contain columns describing gene_symbols and sample_identifiers (data.frame)

col_genes

name of data column containing gene names/symbols (string)

col_samples

name of data column containing sample identifiers (string)

col_mutation_type

name of data column describing mutation types (string, optional)

genes_to_include

specific genes to include in the oncoplot (character, optional)

genes_to_ignore

names of the genes that should be ignored (character, optional)

col_tooltip

name of data column containing whatever information you want to display in the tooltip (string, defaults to col_samples)

topn

how many of the top genes to visualize. Ignored if genes_to_include is supplied (number, default 10)

return_extra_genes_if_tied

instead of strictly returning topn genes, in the case of ties (where multiple genes are mutated in the exact same number of samples, complicating selection of top n genes), return all tied genes (potentially more than topn). If FALSE, will return strictly topn genes, breaking ties based on order of appearance in dataset (flag, default FALSE)
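
The tie handling described above can be illustrated with a small base R sketch. It is not the package's implementation; the toy mutation table and every variable name in it are hypothetical, and it only mirrors the documented behaviour (count samples per gene, break ties by order of appearance, optionally keep all genes tied at the cutoff).

```r
# Toy mutation table (hypothetical data): one row per mutation
mutations <- data.frame(
  gene   = c("TP53", "TP53", "TP53", "EGFR", "EGFR", "PTEN", "PTEN", "NF1"),
  sample = c("S1",   "S2",   "S3",   "S1",   "S2",   "S2",   "S3",   "S1")
)

# Count the number of distinct samples in which each gene is mutated,
# keeping genes in their order of first appearance
gene_sample_pairs <- unique(mutations[, c("gene", "sample")])
counts <- table(factor(gene_sample_pairs$gene, levels = unique(gene_sample_pairs$gene)))

# order() leaves ties in their original order, so tied genes stay in order of appearance
ranked <- counts[order(-as.integer(counts))]

topn <- 2

# Strict selection: exactly topn genes
strict <- names(ranked)[seq_len(topn)]

# Tie-aware selection: every gene with at least as many samples as the topn-th gene
cutoff <- as.integer(ranked)[topn]
with_ties <- names(ranked)[as.integer(ranked) >= cutoff]
```

Here EGFR and PTEN are both mutated in two samples, so the strict selection keeps only the one that appears first, while the tie-aware selection returns three genes for topn = 2.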

draw_gene_barplot

add a barplot describing number of samples with each gene mutated (right side) (flag, default FALSE)

draw_tmb_barplot

add a barplot describing total number of mutations in each sample (above main plot). If a single gene is mutated multiple times, all mutations are counted towards total (flag, default FALSE)

copy

value to copy to clipboard when an oncoplot tile is clicked (string, one of 'sample', 'gene', 'tooltip', 'mutation_type', 'nothing', default 'sample')

palette

a named vector mapping all possible mutation types (vector names) to colors (vector values, optional)
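
As a minimal sketch of the expected shape, a palette is a named character vector. The mutation types and colours below are hypothetical placeholders, and the coverage check is only a suggested sanity test, not something the package is known to require in this form.

```r
# Hypothetical mutation types (names) mapped to colours (values)
mutation_palette <- c(
  Missense_Mutation = "#33A02C",
  Nonsense_Mutation = "#E31A1C",
  Splice_Site       = "#FF7F00"
)

# Mutation types observed in a (hypothetical) dataset
observed_types <- c("Missense_Mutation", "Splice_Site")

# Any observed type without a colour would appear here
missing_types <- setdiff(observed_types, names(mutation_palette))
```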

metadata

dataframe describing sample level metadata. One column must contain unique sample identifiers. Other columns can describe numeric / categorical metadata (data.frame, optional)

metadata_palette

A list of named vectors. List names correspond to metadata column names (categorical only). Vector names correspond to levels of those columns, and vector values are the colors those levels are mapped to. (optional)
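
A sketch of the structure described above, assuming a categorical metadata column named gender whose levels are "MALE" and "FEMALE". The level names and colours are assumptions made for illustration.

```r
# List names = metadata column names; vector names = levels; vector values = colours
metadata_palette <- list(
  gender = c(MALE = "#1F78B4", FEMALE = "#E31A1C")
)

# Look up the colour assigned to one level of one column
colour_for_female <- metadata_palette[["gender"]][["FEMALE"]]
```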

col_samples_metadata

which column in metadata data.frame describes sample identifiers (string, defaults to col_samples)

cols_to_plot_metadata

names of columns in metadata that should be plotted (character, optional)

metadata_require_mutations

filter out samples from metadata lacking any mutations in data (flag, default TRUE)

pathway

a two column dataframe describing pathway. The column containing gene names should have the same name as col_genes (data.frame, optional)
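
A minimal sketch of such a data.frame, assuming col_genes = "Hugo_Symbol". The second column's name and the gene-to-pathway assignments are hypothetical.

```r
# Two columns: gene names (named to match col_genes) and the pathway each gene belongs to
pathway_df <- data.frame(
  Hugo_Symbol = c("TP53", "EGFR", "PTEN"),
  Pathway     = c("p53 signalling", "RTK/RAS", "PI3K")
)
```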

col_genes_pathway

which column in pathway data.frame describes gene names (string, defaults to col_genes)

show_all_samples

show all samples in oncoplot, even if they don't have mutations in the selected genes. Samples only described in metadata but with no mutations at all are still filtered out by default, but you can show these too by setting metadata_require_mutations = FALSE (flag, default FALSE)

total_samples

Strategy for calculating the total number of samples. This value is used to compute the proportion of mutation recurrence displayed in the tooltip when hovering over the gene barplot, or as a text annotation when show_genebar_labels = TRUE is set in ggoncoplot_options().

Possible values:

  • any_mutations: All the samples that are in data (the mutation dataset), irrespective of whether they are on the oncoplot or not.

  • oncoplot: Only the samples that are present on the oncoplot.

  • all: All the samples in either data or metadata.
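
The three strategies above can be compared with a small base R sketch. The sample identifiers are hypothetical and the code only illustrates how the denominator changes; it is not the package's internal calculation.

```r
# Hypothetical sample sets
data_samples     <- c("S1", "S2", "S3", "S4")        # samples with any mutation in data
metadata_samples <- c("S1", "S2", "S3", "S4", "S5")  # samples described in metadata
oncoplot_samples <- c("S1", "S2", "S3")              # samples shown on the oncoplot

# Total number of samples under each strategy
totals <- c(
  any_mutations = length(unique(data_samples)),
  oncoplot      = length(unique(oncoplot_samples)),
  all           = length(union(data_samples, metadata_samples))
)

# Proportion of samples with a given gene mutated, if 2 samples carry a mutation
n_mutated <- 2
proportions <- n_mutated / totals
```

The same gene therefore reports a different recurrence proportion depending on which denominator is chosen.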

interactive

should plot be interactive (boolean, default TRUE)

options

a list of additional visual parameters created by calling ggoncoplot_options(). See ggoncoplot_options for details.

verbose

verbose mode (flag, default TRUE)

...

Arguments passed on to gg1d::gg1d

col_id

name of column to use for

col_sort

column to sort sample order by. By default uses the supplied order of levels in col_id (order of appearance if a character type)

maxlevels

for categorical variables, what is the maximum number of distinct values to allow (too many will make it hard to find a palette that suits). (number)

drop_unused_id_levels

if col_id is a factor with unused levels, should these be dropped or included in visualisation

debug_return_col_info

return column info instead of plots. Helpful when debugging (logical)

palettes

A list of named vectors. List names correspond to data column names (categorical only). Vector names correspond to levels of those columns, and vector values are the colours those levels are mapped to.

colours_default

default colours to use for variables. will be used to colour variables with no palette supplied.

colours_default_logical

colours for binary variables (vector of 3 colors where elements represent colours of TRUE, FALSE, and NA respectively) (character)

colours_missing

colour to use for values of NA (string)

limit_plots

throw an error when there are > 15 plottable columns in table (logical)

cols_to_plot

names of columns in data that should be plotted. By default plots all valid columns (character)

sort_type

controls how categorical variables are sorted. Numerical variables are always sorted in numerical order irrespective of the value given here. Options are alphabetical or frequency

desc

sort in descending order (flag)

width

controls how much space is present between bars and tiles within each plot. Can be 0-1 where values of 1 makes bars/tiles take up 100% of available space (no gaps between bars)

relative_height_numeric

how many times taller should numeric plots be relative to categorical tile plots. Only taken into account if numeric_plot_type == "bar" (number)

tooltip_column_suffix

the suffix added to a column name that indicates column should be used as a tooltip (string)

ignore_column_regex

a regex string that, if it matches a column name, will cause that column to be excluded from plotting (string) (default: "_ignore$")

show_legend_titles

show legend titles (flag)

show_legend

show the legend (flag)

legend_position

position of the legend on the plot (string, options are "right", "left", "bottom", "top")

legend_title_position

position of the title of the legend on the plot (string, options are "top", "bottom", "left", "right")

legend_title_beautify

beautify legend title (add spaces to snake_case / camelCase & capitalise each word) (flag)

numeric_plot_type

visual representation of numeric properties. One of 'bar', for bar charts, or 'heatmap' for heatmaps.

legend_nrow

the number of rows in the legend (number)

legend_ncol

the number of columns in the legend. Set legend_nrow = NULL when using legend_ncol (number)

legend_title_size

the size of the title of the legend (number)

legend_text_size

the size of the text in the legend (number)

legend_key_size

the size of the key in the legend (number)

vertical_spacing

how large should the gap between each data row be (unit = pt) (number)

na_marker

what text should be added to NA values for numeric variables to indicate the value is NA, not 0 (string)

na_marker_size

how large should the na_marker be (number)

na_marker_colour

colour of the na_marker (string)

show_na_marker_categorical

should a text marker of NA values (e.g. '!') be rendered on tiles with NA values (flag)

show_na_marker_heatmap

should a text marker of NA values (e.g. '!') be rendered on tiles with NA values (flag)

show_values_heatmap

should quantitative values be displayed on heatmap tiles (flag)

fontsize_y_text

size of y axis text (number)

y_axis_position

whether y axis should be on left or right side (either 'left' or 'right')

legend_orientation_heatmap

should legend orientation be "horizontal" or "vertical"

colours_heatmap_low

colour of lowest value in heatmap (string)

colours_heatmap_high

colour of highest value in heatmap (string)

transform_heatmap

transformation to apply to values before heatmap visualisation. one of 'identity' (no transformation), 'log10', or 'log2'

fontsize_values_heatmap

font size of text describing values in heatmap (number)

colours_values_heatmap

colour of text describing values in heatmap (string)

fontsize_barplot_y_numbers

fontsize of the text describing numeric barplot max & min values (number)

cli_header

Text used for h1 header. Included so it can be tweaked by packages that use gg1d, so they can customise how the info messages appear.

Value

ggplot object, or girafe object if interactive = TRUE

Examples

# ===== GBM =====
gbm_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_maf.csv.gz"
)

gbm_clinical_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_clinical.csv"
)

gbm_df <- read.csv(file = gbm_csv, header = TRUE)
gbm_clinical_df <- read.csv(file = gbm_clinical_csv, header = TRUE)

# Plot Basic Oncoplot
ggoncoplot(
  gbm_df,
  "Hugo_Symbol",
  "Tumor_Sample_Barcode",
  col_mutation_type = "Variant_Classification",
  metadata = gbm_clinical_df,
  cols_to_plot_metadata = "gender"
)
#>  2 samples with metadata have no mutations. Fitering these out
#>  To keep these samples, set `metadata_require_mutations = FALSE`. To view them in the oncoplot ensure you additionally set `show_all_samples = TRUE`
#> → TCGA-06-0165-01
#> → TCGA-06-0167-01
#> 
#> ── Identify Class ──
#> 
#>  Found 7 unique mutation types in input set
#>  0/7 mutation types were valid PAVE terms
#>  0/7 mutation types were valid SO terms
#>  7/7 mutation types were valid MAF terms
#>  Mutation Types are described using valid MAF terms ... using MAF palete
#> 
#> ── Plotting Sample Metadata ────────────────────────────────────────────────────
#> ! Categorical columns must have <= 6 unique values to be visualised. Columns with too many unique values:  (18),  (327), and  (327)
#> 
#> ── Sorting 
#>  Sorting X axis by: Order of appearance
#> 
#> ── Generating Plot 
#>  Found 1 plottable columns in data
# Customise how the Oncoplot looks
ggoncoplot(
  gbm_df,
  "Hugo_Symbol",
  "Tumor_Sample_Barcode",
  col_mutation_type = "Variant_Classification",
  metadata = gbm_clinical_df,
  cols_to_plot_metadata = "gender",

  # Customise Visual Options
  options = ggoncoplot_options(
    xlab_title = "Glioblastoma Samples",
    ylab_title = "Top 10 mutated genes"
  )
)
#>  2 samples with metadata have no mutations. Fitering these out
#>  To keep these samples, set `metadata_require_mutations = FALSE`. To view them in the oncoplot ensure you additionally set `show_all_samples = TRUE`
#> → TCGA-06-0165-01
#> → TCGA-06-0167-01
#> 
#> ── Identify Class ──
#> 
#>  Found 7 unique mutation types in input set
#>  0/7 mutation types were valid PAVE terms
#>  0/7 mutation types were valid SO terms
#>  7/7 mutation types were valid MAF terms
#>  Mutation Types are described using valid MAF terms ... using MAF palete
#> 
#> ── Plotting Sample Metadata ────────────────────────────────────────────────────
#> ! Categorical columns must have <= 6 unique values to be visualised. Columns with too many unique values:  (18),  (327), and  (327)
#> 
#> ── Sorting 
#>  Sorting X axis by: Order of appearance
#> 
#> ── Generating Plot 
#>  Found 1 plottable columns in data