ggoncoplot
Usage
ggoncoplot(
  data,
  col_genes,
  col_samples,
  col_mutation_type = NULL,
  genes_to_include = NULL,
  genes_to_ignore = NULL,
  col_tooltip = col_samples,
  topn = 10,
  return_extra_genes_if_tied = FALSE,
  draw_gene_barplot = FALSE,
  draw_tmb_barplot = FALSE,
  copy = c("sample", "gene", "tooltip", "mutation_type", "nothing"),
  palette = NULL,
  metadata = NULL,
  metadata_palette = NULL,
  col_samples_metadata = col_samples,
  cols_to_plot_metadata = NULL,
  metadata_require_mutations = TRUE,
  pathway = NULL,
  col_genes_pathway = col_genes,
  show_all_samples = FALSE,
  total_samples = c("any_mutations", "all", "oncoplot"),
  interactive = TRUE,
  options = ggoncoplot_options(),
  verbose = TRUE,
  ...
)
Arguments
- data
data for oncoplot. A data.frame with 1 row per mutation in your cohort. Must contain columns describing gene_symbols and sample_identifiers (data.frame)
- col_genes
name of data column containing gene names/symbols (string)
- col_samples
name of data column containing sample identifiers (string)
- col_mutation_type
name of data column describing mutation types (string, optional)
- genes_to_include
specific genes to include in the oncoplot (character, optional)
- genes_to_ignore
names of the genes that should be ignored (character, optional)
- col_tooltip
name of data column containing whatever information you want to display in the tooltip (string, defaults to col_samples)
- topn
how many of the top genes to visualize. Ignored if genes_to_include is supplied (number, default 10)
- return_extra_genes_if_tied
instead of strictly returning topn genes, in the case of ties (where multiple genes are mutated in the exact same number of samples, complicating selection of top n genes), return all tied genes (potentially more than topn). If FALSE, will return strictly topn genes, breaking ties based on order of appearance in dataset (flag, default FALSE)
- draw_gene_barplot
add a barplot describing number of samples with each gene mutated (right side) (flag, default FALSE)
- draw_tmb_barplot
add a barplot describing total number of mutations in each sample (above main plot). If a single gene is mutated multiple times, all mutations are counted towards total (flag, default FALSE)
- copy
value to copy to clipboard when an oncoplot tile is clicked (string, one of 'sample', 'gene', 'tooltip', 'mutation_type', 'nothing', default 'sample')
- palette
a named vector mapping all possible mutation types (vector names) to colors (vector values, optional)
- metadata
dataframe describing sample level metadata. One column must contain unique sample identifiers. Other columns can describe numeric / categorical metadata (data.frame, optional)
- metadata_palette
A list of named vectors. List names correspond to metadata column names (categorical only). Vector names correspond to the levels of those columns, and vector values are the colors those levels are mapped to. (optional)
- col_samples_metadata
which column in metadata data.frame describes sample identifiers (string, defaults to col_samples)
- cols_to_plot_metadata
names of columns in metadata that should be plotted (character, optional)
- metadata_require_mutations
filter out samples from metadata lacking any mutations in data (flag, default TRUE)
- pathway
a two-column data.frame describing pathways. The column containing gene names should have the same name as col_genes (data.frame, optional)
- col_genes_pathway
which column in pathway data.frame describes gene names (string, defaults to col_genes)
- show_all_samples
show all samples in oncoplot, even if they don't have mutations in the selected genes. Samples only described in metadata but with no mutations at all are still filtered out by default, but you can show these too by setting metadata_require_mutations = FALSE (flag, default FALSE)
- total_samples
Strategy for calculating the total number of samples. This value is used to compute the proportion of mutation recurrence displayed in the tooltip when hovering over the gene barplot, or as a text annotation when ggoncoplot_options(show_genebar_labels = TRUE) is set. Possible values:
any_mutations: All the samples that are in data (the mutation dataset), irrespective of whether they are on the oncoplot or not.
oncoplot: Only the samples that are present on the oncoplot.
all: All the samples in either data or metadata.
- interactive
should plot be interactive (boolean, default TRUE)
- options
a list of additional visual parameters created by calling ggoncoplot_options(). See ggoncoplot_options for details.
- verbose
verbose mode (flag, default TRUE)
- ...
Arguments passed on to
gg1d::gg1d
col_id
name of column to use for
col_sort
column to sort sample order by. By default uses the supplied order of levels in col_id (order of appearance if a character type)
maxlevels
for categorical variables, what is the maximum number of distinct values to allow (too many will make it hard to find a palette that suits). (number)
drop_unused_id_levels
if col_id is a factor with unused levels, should these be dropped or included in visualisation
debug_return_col_info
return column info instead of plots. Helpful when debugging (logical)
palettes
A list of named vectors. List names correspond to data column names (categorical only). Vector names correspond to the levels of those columns, and vector values are the colours those levels are mapped to.
colours_default
default colours to use for variables. will be used to colour variables with no palette supplied.
colours_default_logical
colours for binary variables (vector of 3 colors where elements represent colours of TRUE, FALSE, and NA respectively) (character)
colours_missing
colour to use for values of NA (string)
limit_plots
throw an error when there are > 15 plottable columns in table (logical)
cols_to_plot
names of columns in data that should be plotted. By default plots all valid columns (character)
sort_type
controls how categorical variables are sorted. Numerical variables are always sorted in numerical order irrespective of the value given here. Options are alphabetical or frequency
desc
sort in descending order (flag)
width
controls how much space is present between bars and tiles within each plot. Can be 0-1 where values of 1 makes bars/tiles take up 100% of available space (no gaps between bars)
relative_height_numeric
how many times taller should numeric plots be relative to categorical tile plots. Only taken into account if numeric_plot_type == "bar" (number)
tooltip_column_suffix
the suffix added to a column name that indicates column should be used as a tooltip (string)
ignore_column_regex
a regex string that, if it matches a column name, will cause that column to be excluded from plotting (string) (default: "_ignore$")
show_legend_titles
show legend titles (flag)
show_legend
show the legend (flag)
legend_position
position of the legend on the plot (string, options are "right", "left", "bottom", "top")
legend_title_position
position of the title of the legend on the plot (string, options are "top", "bottom", "left", "right")
legend_title_beautify
beautify legend title (add spaces to snake_case / camelCase & capitalise each word) (flag)
numeric_plot_type
visual representation of numeric properties. One of 'bar', for bar charts, or 'heatmap' for heatmaps.
legend_nrow
the number of rows in the legend (number)
legend_ncol
the number of columns in the legend. Set legend_nrow = NULL when using legend_ncol (number)
legend_title_size
the size of the title of the legend (number)
legend_text_size
the size of the text in the legend (number)
legend_key_size
the size of the key in the legend (number)
vertical_spacing
how large should the gap between each data row be (unit = pt) (number)
na_marker
what text should be added to NA values for numeric variables to indicate the value is NA, not 0 (string)
na_marker_size
how large should the na_marker be (number)
na_marker_colour
colour of the na_marker (string)
show_na_marker_categorical
should a text marker of NA values (e.g. '!') be rendered on tiles with NA values (flag)
show_na_marker_heatmap
should a text marker of NA values (e.g. '!') be rendered on tiles with NA values (flag)
show_values_heatmap
should quantitative values be displayed on heatmap tiles (flag)
fontsize_y_text
size of y axis text (number)
y_axis_position
whether y axis should be on left or right side (either 'left' or 'right')
legend_orientation_heatmap
should legend orientation be "horizontal" or "vertical"
colours_heatmap_low
colour of lowest value in heatmap (string)
colours_heatmap_high
colour of highest value in heatmap (string)
transform_heatmap
transformation to apply to values before heatmap visualisation. one of 'identity' (no transformation), 'log10', or 'log2'
fontsize_values_heatmap
font size of text describing values in heatmap (number)
colours_values_heatmap
colour of text describing values in heatmap (string)
fontsize_barplot_y_numbers
fontsize of the text describing numeric barplot max & min values (number)
cli_header
Text used for h1 header. Included so it can be tweaked by packages that use gg1d, so they can customise how the info messages appear.
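The descriptions of topn, return_extra_genes_if_tied and total_samples above can be made concrete with a small base R sketch. This is an illustration of the documented behaviour on toy data, not the package's actual implementation: the mutations and metadata tables and the select_top_genes() helper are hypothetical, and the "oncoplot" count assumes show_all_samples = FALSE, so that only samples mutated in the selected genes appear on the plot.

```r
# Toy mutation table, one row per mutation (hypothetical data).
# NF1 is mutated twice in S4 but still counts as one mutated sample.
mutations <- data.frame(
  sample = c("S1", "S2", "S3", "S1", "S2", "S1", "S3", "S4", "S4"),
  gene   = c("TP53", "TP53", "TP53", "EGFR", "EGFR", "PTEN", "PTEN", "NF1", "NF1"),
  stringsAsFactors = FALSE
)
# Hypothetical metadata that also lists S5, a sample with no mutations.
metadata <- data.frame(
  sample = c("S1", "S2", "S3", "S4", "S5"),
  stringsAsFactors = FALSE
)

# Illustrative helper (not part of ggoncoplot): rank genes by the number of
# distinct samples in which they are mutated, then take the top n.
select_top_genes <- function(df, topn, return_extra_genes_if_tied = FALSE) {
  dedup  <- unique(df[, c("sample", "gene")])  # a gene counts once per sample
  genes  <- unique(dedup$gene)                 # order of first appearance
  counts <- vapply(genes, function(g) sum(dedup$gene == g), integer(1))
  ranked <- counts[order(-counts)]             # unresolved ties keep input order
  if (length(ranked) <= topn) {
    return(names(ranked))
  }
  if (return_extra_genes_if_tied) {
    cutoff <- ranked[[topn]]                   # count of the n-th ranked gene
    names(ranked)[ranked >= cutoff]            # keep every gene tied with it
  } else {
    names(ranked)[seq_len(topn)]               # strictly n genes
  }
}

# EGFR and PTEN are both mutated in 2 samples, so they tie for second place.
strict_top2 <- select_top_genes(mutations, topn = 2)
tied_top2   <- select_top_genes(mutations, topn = 2,
                                return_extra_genes_if_tied = TRUE)
print(strict_top2)  # "TP53" "EGFR"
print(tied_top2)    # "TP53" "EGFR" "PTEN"

# Denominators corresponding to the three total_samples strategies.
total_samples <- c(
  any_mutations = length(unique(mutations$sample)),
  oncoplot      = length(unique(mutations$sample[mutations$gene %in% strict_top2])),
  all           = length(union(mutations$sample, metadata$sample))
)
print(total_samples)
```

In this toy cohort the same gene would be reported with a different recurrence proportion depending on the strategy, because the denominator changes from the samples on the plot, to all mutated samples, to every sample known from either table.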
Examples
# ===== GBM =====
gbm_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_maf.csv.gz"
)
gbm_clinical_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_clinical.csv"
)
gbm_df <- read.csv(file = gbm_csv, header = TRUE)
gbm_clinical_df <- read.csv(file = gbm_clinical_csv, header = TRUE)
# Plot Basic Oncoplot
ggoncoplot(
  gbm_df,
  "Hugo_Symbol",
  "Tumor_Sample_Barcode",
  col_mutation_type = "Variant_Classification",
  metadata = gbm_clinical_df,
  cols_to_plot_metadata = "gender"
)
#> ℹ 2 samples with metadata have no mutations. Fitering these out
#> ℹ To keep these samples, set `metadata_require_mutations = FALSE`. To view them in the oncoplot ensure you additionally set `show_all_samples = TRUE`
#> → TCGA-06-0165-01
#> → TCGA-06-0167-01
#>
#> ── Identify Class ──
#>
#> ℹ Found 7 unique mutation types in input set
#> ℹ 0/7 mutation types were valid PAVE terms
#> ℹ 0/7 mutation types were valid SO terms
#> ℹ 7/7 mutation types were valid MAF terms
#> ✔ Mutation Types are described using valid MAF terms ... using MAF palete
#>
#> ── Plotting Sample Metadata ────────────────────────────────────────────────────
#> ! Categorical columns must have <= 6 unique values to be visualised. Columns with too many unique values: (18), (327), and (327)
#>
#> ── Sorting
#> ℹ Sorting X axis by: Order of appearance
#>
#> ── Generating Plot
#> ℹ Found 1 plottable columns in data
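The message in the output above notes that two samples with metadata but no mutations were filtered out. A sketch of keeping them, following that message and the argument descriptions: set metadata_require_mutations = FALSE and show_all_samples = TRUE. Setting total_samples = "all" as well is an optional choice made here so that the recurrence proportions use every sample in data or metadata. The call is guarded so the snippet also runs where the bundled test data cannot be found.

```r
# Locate the bundled GBM test data; system.file() returns "" if the package
# or file is not available.
gbm_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_maf.csv.gz"
)
gbm_clinical_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_clinical.csv"
)

if (nzchar(gbm_csv) && nzchar(gbm_clinical_csv)) {
  gbm_df <- read.csv(file = gbm_csv, header = TRUE)
  gbm_clinical_df <- read.csv(file = gbm_clinical_csv, header = TRUE)

  # Keep samples that appear only in the metadata and draw them on the plot
  ggoncoplot::ggoncoplot(
    gbm_df,
    "Hugo_Symbol",
    "Tumor_Sample_Barcode",
    col_mutation_type = "Variant_Classification",
    metadata = gbm_clinical_df,
    cols_to_plot_metadata = "gender",
    metadata_require_mutations = FALSE, # do not drop mutation-free samples
    show_all_samples = TRUE,            # show samples lacking hits in the top genes
    total_samples = "all"               # denominator: samples in data or metadata
  )
}
```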
# Customise how the Oncoplot looks
ggoncoplot(
  gbm_df,
  "Hugo_Symbol",
  "Tumor_Sample_Barcode",
  col_mutation_type = "Variant_Classification",
  metadata = gbm_clinical_df,
  cols_to_plot_metadata = "gender",
  # Customise Visual Options
  options = ggoncoplot_options(
    xlab_title = "Glioblastoma Samples",
    ylab_title = "Top 10 mutated genes"
  )
)
#> ℹ 2 samples with metadata have no mutations. Fitering these out
#> ℹ To keep these samples, set `metadata_require_mutations = FALSE`. To view them in the oncoplot ensure you additionally set `show_all_samples = TRUE`
#> → TCGA-06-0165-01
#> → TCGA-06-0167-01
#>
#> ── Identify Class ──
#>
#> ℹ Found 7 unique mutation types in input set
#> ℹ 0/7 mutation types were valid PAVE terms
#> ℹ 0/7 mutation types were valid SO terms
#> ℹ 7/7 mutation types were valid MAF terms
#> ✔ Mutation Types are described using valid MAF terms ... using MAF palete
#>
#> ── Plotting Sample Metadata ────────────────────────────────────────────────────
#> ! Categorical columns must have <= 6 unique values to be visualised. Columns with too many unique values: (18), (327), and (327)
#>
#> ── Sorting
#> ℹ Sorting X axis by: Order of appearance
#>
#> ── Generating Plot
#> ℹ Found 1 plottable columns in data
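# Supply custom palettes
The palette and metadata_palette arguments expect a named vector and a list of named vectors respectively. The sketch below builds both from the values actually present in the data, so every mutation type and metadata level gets a colour. The make_palette() helper is hypothetical (it is not part of ggoncoplot), and the "Dark 3" colours from grDevices::hcl.colors() are an arbitrary choice; any named vector of valid colours should fit the documented shape.

```r
# Hypothetical helper: map each distinct non-missing value to a colour.
# Names are the values, elements are the colours.
make_palette <- function(values, hcl_palette = "Dark 3") {
  levels <- sort(unique(as.character(values)))  # sort() drops NA values
  stats::setNames(
    grDevices::hcl.colors(length(levels), palette = hcl_palette),
    levels
  )
}

# Small demonstration on toy mutation types
toy_palette <- make_palette(c("Missense_Mutation", "Silent", "Missense_Mutation"))
print(names(toy_palette))

gbm_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_maf.csv.gz"
)
gbm_clinical_csv <- system.file(
  package = "ggoncoplot",
  "testdata/GBM_tcgamutations_mc3_clinical.csv"
)

# Only run the plotting call when the bundled test data is available
if (nzchar(gbm_csv) && nzchar(gbm_clinical_csv)) {
  gbm_df <- read.csv(file = gbm_csv, header = TRUE)
  gbm_clinical_df <- read.csv(file = gbm_clinical_csv, header = TRUE)

  ggoncoplot::ggoncoplot(
    gbm_df,
    "Hugo_Symbol",
    "Tumor_Sample_Barcode",
    col_mutation_type = "Variant_Classification",
    # named vector: mutation type -> colour
    palette = make_palette(gbm_df$Variant_Classification),
    metadata = gbm_clinical_df,
    cols_to_plot_metadata = "gender",
    # list of named vectors: metadata column -> (level -> colour)
    metadata_palette = list(gender = make_palette(gbm_clinical_df$gender))
  )
}
```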