Calculates the Lp distance (also known as the Minkowski distance) between two sigverse signatures or catalogues. This generalizes various distance metrics depending on the choice of p.
Usage
sig_lp_distance(
  signature1,
  signature2,
  p,
  value = c("fraction", "count"),
  scale = FALSE
)
Arguments
- signature1, signature2
  Two sigverse signature or catalogue data.frames.
- p
  A numeric value ≥ 0 indicating the order of the Lp norm to compute.
  p = 0: counts the number of non-zero entries (not a true norm; useful for sparsity).
  p = 1: Manhattan distance (sum of absolute differences).
  p = 2: Euclidean (L2) distance.
  p = ∞: Chebyshev distance (maximum absolute difference).
- value
  Either "fraction" (default) or "count"; determines which column is used for comparison.
- scale
  Logical. If TRUE, the distance is divided by the number of contexts (i.e. channels).
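The special cases of p listed above can be illustrated with a minimal base R sketch on plain numeric vectors. The helper lp_distance() is hypothetical and is not part of sigverse; it also assumes that p = 0 counts the non-zero entries of the element-wise difference, which may not match how sig_lp_distance() handles that case internally.

```r
# Hypothetical helper (not a sigverse function): Lp distance between two
# numeric vectors of equal length.
lp_distance <- function(x, y, p) {
  stopifnot(length(x) == length(y), is.numeric(p), length(p) == 1, p >= 0)
  d <- abs(x - y)
  if (p == 0) return(sum(d != 0))     # assumed: count of entries that differ
  if (is.infinite(p)) return(max(d))  # Chebyshev: largest absolute difference
  sum(d^p)^(1 / p)                    # general Minkowski form
}

x <- c(0.5, 0.3, 0.2, 0.0)
y <- c(0.1, 0.3, 0.2, 0.4)

lp_distance(x, y, p = 0)    # 2
lp_distance(x, y, p = 1)    # 0.8
lp_distance(x, y, p = 2)    # 0.5656854
lp_distance(x, y, p = Inf)  # 0.4
```

Handling p = 0 and p = Inf as separate branches avoids evaluating 1 / p at zero and taking a limit at infinity.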
Details
This function is useful for flexible distance computations when comparing mutational signatures or catalogues. All channels must match and be in the same order.
By default, distances are computed using raw values. If scale = TRUE, the distance is divided by the number of mutation contexts to allow comparisons across different signature types (e.g. SBS vs DBS).
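The channel requirement and the effect of scale = TRUE can be sketched by hand in base R. The toy data.frames below assume a sigverse-style layout with type, channel and fraction columns; the four channels and their values are invented for illustration, and the calculation imitates the documented behaviour rather than reproducing the package's own code.

```r
# Toy signatures in an assumed sigverse-style layout (type, channel, fraction).
sig_a <- data.frame(
  type     = c("C>A", "C>G", "C>T", "T>A"),
  channel  = c("A[C>A]A", "A[C>G]A", "A[C>T]A", "A[T>A]A"),
  fraction = c(0.5, 0.3, 0.2, 0.0)
)
sig_b <- sig_a
sig_b$fraction <- c(0.1, 0.3, 0.2, 0.4)

# Channels must be identical and in the same order before comparing.
stopifnot(identical(sig_a$channel, sig_b$channel))

l1        <- sum(abs(sig_a$fraction - sig_b$fraction))  # unscaled L1 distance
l1_scaled <- l1 / nrow(sig_a)                           # divided by channel count

l1         # 0.8
l1_scaled  # 0.2
```

The same relationship is visible in the Examples: the scaled L1 value 0.01747191 is the unscaled value 1.677304 divided by 96, the number of SBS channels.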
Examples
library(sigstash)
sigs <- sig_load("COSMIC_v3.3.1_SBS_GRCh38")
sig_lp_distance(sigs[["SBS1"]], sigs[["SBS5"]], p = 1) # L1 (Manhattan)
#> [1] 1.677304
sig_lp_distance(sigs[["SBS1"]], sigs[["SBS5"]], p = 2) # L2 (Euclidean)
#> [1] 0.4766606
sig_lp_distance(sigs[["SBS1"]], sigs[["SBS5"]], p = 1, scale = TRUE)
#> [1] 0.01747191