
Computes the Shannon diversity index for a sigverse signature. This metric quantifies the entropy or uncertainty associated with the distribution of mutation contexts in a signature, based on the relative fraction of mutations in each context.

Usage

sig_shannon(signature, exponentiate = FALSE)

Arguments

signature

A sigverse signature data.frame. See sigshared::example_signature().

exponentiate

Logical. If TRUE, returns the exponentiated Shannon index (effective number of contexts) instead of the raw entropy. Defaults to FALSE.

Value

A numeric value: either the Shannon index (entropy) or the exponentiated index (effective diversity).

Details

By default, the function returns the Shannon index as an entropy value. If exponentiate = TRUE, the function returns the exponentiated Shannon index, also known as the effective number of contexts (or Hill number of order 1). This makes interpretation more intuitive:

  • A signature concentrated entirely in a single context has an exponentiated index of 1.

  • A perfectly uniform signature (equal weight across all 96 SBS contexts) has an exponentiated index of 96.

In practical terms, the exponentiated Shannon index answers: "How many equally frequent mutation contexts would give this level of diversity?"
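The two boundary cases above can be checked with a minimal base-R sketch. It is not the sig_shannon() implementation: shannon_from_fractions() is a hypothetical helper that takes a plain numeric vector of context fractions rather than a sigverse signature data.frame. It assumes the fractions sum to 1, uses the natural logarithm (consistent with the example output on this page, where exp(4.385754) is about 80.3), and treats zero-weight contexts as contributing nothing to the sum.

```r
# Hypothetical helper for illustration only; not part of sigverse.
# p: numeric vector of context fractions, assumed to sum to 1.
shannon_from_fractions <- function(p, exponentiate = FALSE) {
  stopifnot(is.numeric(p), all(p >= 0))
  p <- p[p > 0]              # drop zeros: 0 * log(0) is taken as 0
  h <- -sum(p * log(p))      # Shannon entropy with natural log
  if (exponentiate) exp(h) else h
}

# All weight in a single context out of 96
single <- c(1, rep(0, 95))

# Equal weight across 96 contexts
uniform <- rep(1 / 96, 96)

# Effective number of contexts: 1 for the single-context case,
# 96 (up to floating-point rounding) for the uniform case
shannon_from_fractions(single, exponentiate = TRUE)
shannon_from_fractions(uniform, exponentiate = TRUE)
```

Real signatures fall between these two extremes, which is why the exponentiated value can be read directly as a count of equally weighted contexts.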

Examples

library(sigstash)
signatures <- sig_load("COSMIC_v3.3.1_SBS_GRCh38")
sbs3 <- signatures[["SBS3"]]
sbs48 <- signatures[["SBS48"]]

# Shannon entropy
sig_shannon(sbs3)
#> [1] 4.385754

# Exponentiated Shannon index (effective number of contexts)
sig_shannon(sbs3, exponentiate = TRUE)
#> [1] 80.29877

# Compare with a highly focused signature
sig_shannon(sbs48, exponentiate = TRUE)
#> [1] 2.681695