Visualize relationships between numeric variables and categorical groupings using parallel coordinate plots.
Usage
ggparallel(
  data,
  col_id = NULL,
  col_colour = NULL,
  highlight = NULL,
  interactive = TRUE,
  order_columns_by = c("appearance", "random", "auto"),
  order_observations_by = c("frequency", "original"),
  verbose = TRUE,
  palette_colour = palette.colors(palette = "Set2"),
  palette_highlight = c("red", "grey90"),
  convert_binary_numeric_to_factor = TRUE,
  scaling = c("uniminmax", "none"),
  return = c("plot", "data"),
  options = ggparallel_options()
)
Arguments
- data
A data frame containing the variables to plot.
- col_id
The name of the column to use as an identifier. If NULL, artificial IDs will be generated based on row numbers. (character)
- col_colour
Name of the column to use for coloring lines in the plot. If NULL, no coloring is applied. (character)
- highlight
A level from col_colour to emphasize in the plot. Ignored if col_colour is not set. (character)
- interactive
Produce an interactive ggiraph visualisation. (flag)
- order_columns_by
Strategy for ordering columns in the plot. Options include:
"appearance": Order columns by their order in data (default).
"random": Randomly order columns.
"auto": Automatically order columns based on context:
If highlight is set, columns are ordered to maximize separation between the highlighted level and all others, using mutual information.
If col_colour is set but highlight is not, columns are ordered based on mutual information with all classes in col_colour.
If neither highlight nor col_colour is set, columns are ordered to minimize the estimated number of crossings, using a repetitive nearest neighbour approach with two-opt refinement.
- order_observations_by
Strategy for ordering lines in the plot. Options include:
"frequency": Draw the largest groups first.
"original": Preserve the original order in data.
Ignored if highlight is set.
- verbose
Logical; whether to display informative messages during execution. (default: TRUE)
- palette_colour
A named vector of colors for categorical levels in col_colour. (default: Set2 palette)
- palette_highlight
A two-color vector for highlighting (highlight and others). (default: c("red", "grey90"))
- convert_binary_numeric_to_factor
Logical; whether to convert numeric columns containing only 0, 1, and NA to factors. (default: TRUE)
- scaling
Method for scaling numeric variables. Options include:
"uniminmax": Rescale each variable to range [0, 1].
"none": No rescaling. Use raw values.
- return
What to return. Options include:
"plot": Return the ggplot object (default).
"data": Return the processed data used for plotting.
- options
A list of additional visualization parameters created by ggparallel_options().
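The two preprocessing arguments, convert_binary_numeric_to_factor and scaling = "uniminmax", can be illustrated with a small base R sketch. This is not the package's implementation: the helper names rescale_unit() and is_binary_numeric() are hypothetical, and both the order of the two steps and the treatment of constant columns are assumptions made for illustration.

```r
# Hypothetical helper: min-max rescale a numeric vector to [0, 1].
rescale_unit <- function(x) {
  rng <- range(x, na.rm = TRUE)
  if (rng[1] == rng[2]) {
    # Constant column: avoid dividing by zero (mapping to 0.5 is an assumption).
    return(ifelse(is.na(x), NA_real_, 0.5))
  }
  (x - rng[1]) / (rng[2] - rng[1])
}

# Hypothetical helper: TRUE if a numeric vector holds only 0, 1 and NA.
is_binary_numeric <- function(x) {
  is.numeric(x) && all(x %in% c(0, 1, NA))
}

df <- data.frame(
  length = c(10, 15, 20, NA),
  flag   = c(0, 1, NA, 1)
)

# Convert 0/1 columns to factors first, then rescale what is still numeric
# (doing the steps in this order is an assumption).
binary <- vapply(df, is_binary_numeric, logical(1))
df[binary] <- lapply(df[binary], factor)

numeric_cols <- vapply(df, is.numeric, logical(1))
df[numeric_cols] <- lapply(df[numeric_cols], rescale_unit)

str(df)
```

After rescaling, every numeric axis shares the same [0, 1] range, so variables measured in different units can be drawn on parallel axes of equal height. With scaling = "none" the raw values are used instead.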
Examples
ggparallel(
  data = minibeans,
  col_colour = "Class",
  order_columns_by = "auto"
)
#> ℹ Ordering columns based on mutual information with [Class]
#> ℹ Making plot interactive since `interactive = TRUE`
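To give a feel for what ordering by mutual information means, here is a rough base R sketch that ranks the numeric columns of the built-in iris data by how much they tell us about Species. It is only an illustration: mutual_information() is a hypothetical helper, the equal-width binning with five bins is an assumption, and the estimator the package actually uses may differ.

```r
# Hypothetical helper: mutual information (in bits) between a numeric column,
# discretised into equal-width bins, and a categorical vector.
mutual_information <- function(x, class, bins = 5) {
  binned <- cut(x, breaks = bins)
  joint <- table(binned, class)
  joint <- joint / sum(joint)                        # joint probabilities p(x, y)
  expected <- outer(rowSums(joint), colSums(joint))  # p(x) * p(y) under independence
  nonzero <- joint > 0                               # skip empty cells (0 * log 0 = 0)
  sum(joint[nonzero] * log2(joint[nonzero] / expected[nonzero]))
}

numeric_cols <- names(iris)[vapply(iris, is.numeric, logical(1))]

# Mutual information of each column with all classes.
mi_all <- vapply(
  numeric_cols,
  function(col) mutual_information(iris[[col]], iris$Species),
  numeric(1)
)

# One-vs-rest variant: collapse the classes to "this level" versus "all others",
# analogous to setting highlight.
mi_one <- vapply(
  numeric_cols,
  function(col) mutual_information(iris[[col]], iris$Species == "setosa"),
  numeric(1)
)

# Columns sorted from most to least informative.
names(sort(mi_all, decreasing = TRUE))
names(sort(mi_one, decreasing = TRUE))
```

A column with high mutual information separates the classes well on its own axis, so sorting by it shows the most discriminating variables first.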
ggparallel(
  data = minibeans,
  col_colour = "Class",
  highlight = "DERMASON",
  order_columns_by = "auto"
)
#> ℹ Ordering columns based on how well they differentiate 1 group from the rest [DERMASON] (based on mutual information)
#> ℹ Making plot interactive since `interactive = TRUE`
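Neither example above exercises the third "auto" case, where neither highlight nor col_colour is set and columns are ordered to reduce line crossings. The sketch below shows one way such an ordering could be computed in base R: count crossings between every pair of columns, build a path with nearest neighbour from every possible starting column, then refine the best path with two-opt. All helper names are hypothetical, and the exact crossing count used here is an assumption; the package describes its own count as an estimate, so its details are likely to differ.

```r
# Two lines cross between axes a and b when their order flips (discordant pair).
crossings <- function(a, b) {
  sum(outer(a, a, "-") * outer(b, b, "-") < 0) / 2
}

# Total crossings between adjacent axes for a given column order.
path_cost <- function(path, d) {
  sum(d[cbind(path[-length(path)], path[-1])])
}

# Greedy path: always move to the unvisited column with the fewest crossings.
nearest_neighbour_path <- function(start, d) {
  path <- start
  remaining <- setdiff(seq_len(nrow(d)), start)
  while (length(remaining) > 0) {
    last <- path[length(path)]
    nxt <- remaining[which.min(d[last, remaining])]
    path <- c(path, nxt)
    remaining <- setdiff(remaining, nxt)
  }
  path
}

# Two-opt: reverse a segment whenever that lowers the total cost.
two_opt <- function(path, d) {
  improved <- TRUE
  while (improved) {
    improved <- FALSE
    for (i in seq_len(length(path) - 1)) {
      for (j in (i + 1):length(path)) {
        candidate <- path
        candidate[i:j] <- rev(path[i:j])
        if (path_cost(candidate, d) < path_cost(path, d)) {
          path <- candidate
          improved <- TRUE
        }
      }
    }
  }
  path
}

x <- iris[, 1:4]
p <- ncol(x)

# Pairwise crossing counts between all columns.
d <- matrix(0, p, p)
for (i in seq_len(p)) {
  for (j in seq_len(p)) {
    if (i != j) d[i, j] <- crossings(x[[i]], x[[j]])
  }
}

# Repetitive nearest neighbour: try every start, keep the cheapest path.
paths <- lapply(seq_len(p), nearest_neighbour_path, d = d)
costs <- vapply(paths, path_cost, numeric(1), d = d)
best <- two_opt(paths[[which.min(costs)]], d)

names(x)[best]
```

Fewer crossings between neighbouring axes generally makes individual lines easier to follow, which is why this ordering is a sensible default when there is no grouping variable to separate.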