Version 0.6.0

New Features

  • distance() and all other individual information theory functions receive a new argument epsilon with default value epsilon = 0.00001 to treat cases where in individual distance or similarity computations yield x / 0 or 0 / 0. Instead of a hard coded epsilon, users can now set epsilon according to their input vectors. (Many thanks to Joshua McNeill #26 for this great question).
  • three new functions dist_one_one(), dist_one_many(), dist_many_many() are added. They are fairly flexible intermediaries between distance() and single distance functions. dist_one_one() expects two vectors (probability density functions) and returns a single value. dist_one_many() expects one vector (a probability density function) and one matrix (a set of probability density functions), and returns a vector of values. dist_many_many() expects two matrices (two sets of probability density functions), and returns a matrix of values. (Many thanks to Jakub Nowosad, see #27, #28, and New Vignette Many_Distance)

Updates


library(philentropy)

m1 = matrix(c(1, 2), ncol = 1)

dist(m1)
#> 1
#> 2 1
distance(m1, as.dist.obj = TRUE)
#> Metric: 'euclidean'; comparing: 2 vectors.
#> 1
#> 2 1

Version 0.5.0

New Features

  • the distance() function receives a new argument mute.message allowing users to mute message printing when running large-scale distance computations. Example:

distance(rbind(1:10/sum(1:10), 20:29/sum(20:29)), 
         method = "euclidean", 
         mute.message = TRUE)

Version 0.4.0

New Features

Before:


ProbMatrix <- rbind(1:10/sum(1:10), 20:29/sum(20:29),30:39/sum(30:39))
dist.mat <- distance(ProbMatrix, method = "jaccard")
true.dist.mat <- as.dist(dist.mat)
clust.res <- hclust(true.dist.mat, method = "complete")
clust.res
Call:
hclust(d = true.dist.mat, method = "complete")

Cluster method   : complete 
Number of objects: 3 

Now:


ProbMatrix <- rbind(1:10/sum(1:10), 20:29/sum(20:29),30:39/sum(30:39))
dist.mat <- distance(ProbMatrix, method = "jaccard", as.dist.obj = TRUE)
clust.res <- hclust(true.dist.mat, method = "complete")
clust.res
Call:
hclust(d = true.dist.mat, method = "complete")

Cluster method   : complete 
Number of objects: 3 

Bug fixes

  • fixing a bug in gJSD() which tested transposed matrix rows rather than transposed matrix columns for sum > 1 (see issue #17 ; many thanks to @wkc1986)

Version 0.3.0

New functionality

Bug fixes

  • fixing bug which caused that KL distance returns NaN when P == 0 (see issue #10; Many thanks to @KaiserDominici)

  • fixing bug which caused stack overflow when computing distance matrices with many rows (see issue #7; Many thanks to @wkc1986 and @elbamos)

  • fixing bug in gJSD() where an rbind() input matrix is not properly transposed (Many thanks to @vrodriguezf; see issue #14)

New Features

  • gJSD() receives new argument est.prob to enable empirical estimation of probability vectors from input count vectors (non-probabilistic vectors)

  • Jaccard and Tanimoto similarity measures now return 0 instead of NAN when probability vectors contain zeros (Many thanks to @JonasMandel; see issue #15)

Version 0.2.0

Bug fixes

  • Fixing bug that caused jensen-shannon computations to compute wrong values when 0 values were present in the input vectors (see issue #4 ; Many thanks to @wkc1986)
  • Fixing bug that caused jensen-difference computations to compute wrong values when 0 values were present in the input vectors
  • Fixing bugs in all distance metrics when handing 0/0, 0/x or x/0 cases

Version 0.1.0

New Features

  • new message system
  • extending documentation

Bug fixes

Version 0.0.2

Bug fixes

Version 0.0.1

Initial submission version.