This function computes the Generalized Jensen-Shannon Divergence of a probability matrix.
gJSD(x, unit = "log2", weights = NULL, est.prob = NULL)
a probability matrix.
a character string specifying the logarithm unit that shall be used to compute distances that depend on log computations.
a numeric vector specifying the weights for each distribution in x. Default: weights = NULL; in this case all distributions are weighted equally (= uniform distribution of weights).
In case users wish to specify non-uniform weights for e.g. 3 distributions, they can specify the argument weights = c(0.5, 0.25, 0.25). This notation denotes that vec1 is weighted by 0.5, vec2 is weighted by 0.25, and vec3 is weighted by 0.25 as well.
method to estimate probabilities from input count vectors (i.e. non-probability vectors). Default: est.prob = NULL. Options are:
est.prob = "empirical": The relative frequencies of each vector are computed internally. For example, an input matrix rbind(1:10, 11:20) will be transformed to a probability matrix rbind(1:10 / sum(1:10), 11:20 / sum(11:20)).
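The "empirical" transformation described above can be sketched in base R as follows. This only illustrates row-wise relative frequencies and is not taken from the package's internal code; the names counts and probs are hypothetical.

```r
# Hypothetical count matrix: one count vector per row
counts <- rbind(1:10, 11:20)

# Divide every row by its own total to obtain relative frequencies.
# rowSums(counts) has one entry per row, so R recycles it row-wise.
probs <- counts / rowSums(counts)

# Each row of probs now sums to 1
rowSums(probs)
```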
The Generalized Jensen-Shannon Divergence of all distributions (rows) in x, returned as a single numeric value.
Function to compute the Generalized Jensen-Shannon Divergence
\(JSD_{\pi_1,\ldots,\pi_n}(P_1, \ldots, P_n) = H\left(\sum_{i = 1}^n \pi_i P_i\right) - \sum_{i = 1}^n \pi_i H(P_i)\)
where \(\pi_1,\ldots,\pi_n\) denote the weights selected for the probability vectors \(P_1,\ldots,P_n\) and \(H(P_i)\) denotes the Shannon Entropy of probability vector \(P_i\).
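To make the formula concrete, here is a minimal base R sketch that evaluates it term by term. The helpers shannon_entropy and gjsd_sketch are hypothetical names introduced for illustration; they are not part of the package and may differ from how gJSD() is implemented internally. The sketch assumes that each row of x is a probability vector and that the weights sum to 1.

```r
# Shannon entropy of a probability vector.
# Entries equal to 0 are dropped, following the convention 0 * log(0) = 0.
shannon_entropy <- function(p, log_fun = log2) {
  p <- p[p > 0]
  -sum(p * log_fun(p))
}

# Hypothetical helper: generalized JSD computed directly from the formula.
gjsd_sketch <- function(x, weights = NULL, log_fun = log2) {
  n <- nrow(x)
  # Uniform weights when none are given
  if (is.null(weights)) weights <- rep(1 / n, n)
  stopifnot(length(weights) == n, isTRUE(all.equal(sum(weights), 1)))

  # Weighted mixture: sum_i pi_i * P_i (row i is scaled by weights[i])
  mixture <- colSums(x * weights)

  # Entropy of each row: H(P_i)
  row_entropies <- apply(x, 1, shannon_entropy, log_fun = log_fun)

  # H(mixture) - sum_i pi_i * H(P_i)
  shannon_entropy(mixture, log_fun) - sum(weights * row_entropies)
}

# Two distributions with disjoint support and equal weights
gjsd_sketch(rbind(c(1, 0), c(0, 1)))

# Non-uniform weights for three distributions
Prob <- rbind(1:10 / sum(1:10), 20:29 / sum(20:29), 30:39 / sum(30:39))
gjsd_sketch(Prob, weights = c(0.5, 0.25, 0.25))
```

For two distributions with disjoint support and equal weights, the mixture has an entropy of 1 bit and both individual entropies are 0, so the first call gives 1 when log2 is used.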
# define input probability matrix
Prob <- rbind(1:10/sum(1:10), 20:29/sum(20:29), 30:39/sum(30:39))
# compute the Generalized JSD of the probability matrix
gJSD(Prob)
#> No weights were specified ('weights = NULL'), thus equal weights for all distributions will be calculated and applied.
#> Metric: 'gJSD'; unit = 'log2'; comparing: 3 vectors (v1, ... , v3).
#> Weights: v1 = 0.333333333333333, v2 = 0.333333333333333, v3 = 0.333333333333333
#> [1] 0.03512892
# Generalized Jensen-Shannon Divergence between three vectors using different log bases
gJSD(Prob, unit = "log2") # Default
#> No weights were specified ('weights = NULL'), thus equal weights for all distributions will be calculated and applied.
#> Metric: 'gJSD'; unit = 'log2'; comparing: 3 vectors (v1, ... , v3).
#> Weights: v1 = 0.333333333333333, v2 = 0.333333333333333, v3 = 0.333333333333333
#> [1] 0.03512892
gJSD(Prob, unit = "log")
#> No weights were specified ('weights = NULL'), thus equal weights for all distributions will be calculated and applied.
#> Metric: 'gJSD'; unit = 'log'; comparing: 3 vectors (v1, ... , v3).
#> Weights: v1 = 0.333333333333333, v2 = 0.333333333333333, v3 = 0.333333333333333
#> [1] 0.02434951
gJSD(Prob, unit = "log10")
#> No weights were specified ('weights = NULL'), thus equal weights for all distributions will be calculated and applied.
#> Metric: 'gJSD'; unit = 'log10'; comparing: 3 vectors (v1, ... , v3).
#> Weights: v1 = 0.333333333333333, v2 = 0.333333333333333, v3 = 0.333333333333333
#> [1] 0.01057486
# Generalized Jensen-Shannon Divergence between count vectors P.count, Q.count, and R.count
P.count <- 1:10
Q.count <- 20:29
R.count <- 30:39
x.count <- rbind(P.count, Q.count, R.count)
gJSD(x.count, est.prob = "empirical")
#> No weights were specified ('weights = NULL'), thus equal weights for all distributions will be calculated and applied.
#> Metric: 'gJSD'; unit = 'log2'; comparing: 3 vectors (v1, ... , v3).
#> Weights: v1 = 0.333333333333333, v2 = 0.333333333333333, v3 = 0.333333333333333
#> [1] 0.03512892
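The results printed above for the three units differ only by a constant factor, because changing the logarithm base rescales every entropy term by the same amount. The following base R check uses the printed log2 value as its input; the variable names are hypothetical.

```r
# Value printed above for unit = "log2"
jsd_bits <- 0.03512892

# Rescale to natural log and log10 units
jsd_nats  <- jsd_bits * log(2)     # compare with the unit = "log" output
jsd_log10 <- jsd_bits * log10(2)   # compare with the unit = "log10" output

round(c(jsd_nats, jsd_log10), 8)
```

Up to rounding of the printed values, these agree with the outputs shown for unit = "log" and unit = "log10".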