🧭 Similarity and Distance Quantification between Probability Functions

Describe and understand the world through data.

Data collection and data comparison are the foundations of scientific research.
Mathematics provides the abstract framework to describe the patterns we observe in nature, and statistics provides the framework to quantify the uncertainty of these patterns.

In statistics, natural patterns are described by probability distributions that either follow a fixed functional form determined by a few parameters (parametric distributions) or are estimated more flexibly from the data (non-parametric distributions).

The philentropy package implements fundamental distance and similarity measures for quantifying distances between probability density functions, together with traditional information theory measures.
In this way, it aims to provide a framework for comparing natural patterns in statistical terms.

🧡 This project was born out of my passion for statistics, and I hope it will be useful to those who share that passion.


⚙️ Installation

# install philentropy version 0.10.0 from CRAN
install.packages("philentropy")

Or get the latest development version:

# install.packages("devtools")
library(devtools)
install_github("HajkD/philentropy", build_vignettes = TRUE, dependencies = TRUE)

🧾 Citation

HG Drost (2018).
Philentropy: Information Theory and Distance Quantification with R.
Journal of Open Source Software, 3(26), 765.
https://doi.org/10.21105/joss.00765

🪶 I am developing philentropy in my spare time and would be very grateful if you would consider citing the paper above if it was useful for your research. These citations help me continue maintaining and extending the package.


🧩 Quick Start

library(philentropy)

P <- c(0.1, 0.2, 0.7)
Q <- c(0.2, 0.2, 0.6)

distance(rbind(P, Q), method = "jensen-shannon")
jensen-shannon using unit 'log'.
jensen-shannon
    0.01041993

💡 Tip: Got a large matrix (rows = samples, cols = features)?
Use distance(X, method="cosine", mute.message=TRUE) to compute the full pairwise matrix quickly and quietly.
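
To make the shape of a full pairwise result concrete, here is a minimal base R sketch that computes cosine similarity between all rows of a small matrix. The data are made up, and the hand-written computation is only an illustration that assumes "cosine" denotes the usual cosine similarity; it is not the package's implementation. The optional call at the end runs only if philentropy is installed.

```r
# Hypothetical data: 4 samples (rows) x 3 features (columns); values are made up.
X <- rbind(
  s1 = c(1, 2, 3),
  s2 = c(2, 4, 6),   # same direction as s1, twice the magnitude
  s3 = c(3, 0, 1),
  s4 = c(0, 1, 0)
)

# Scale every row to unit Euclidean length, then take all row-by-row dot products.
X_unit  <- X / sqrt(rowSums(X^2))
cos_sim <- tcrossprod(X_unit)   # 4 x 4 matrix, one entry per pair of samples
print(round(cos_sim, 3))

# Optional comparison with the package, skipped when philentropy is not installed.
if (requireNamespace("philentropy", quietly = TRUE)) {
  pkg <- philentropy::distance(X, method = "cosine", mute.message = TRUE)
  print(round(pkg, 3))
}
```

With n rows the result has n x n entries, so memory grows quadratically with the number of samples.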


🧪 When should I use which distance?

| Goal | Recommended Methods |
|---|---|
| 🔁 Clustering / similarity | cosine, correlation, euclidean |
| 📊 Probability or compositional data | jensen-shannon, hellinger, kullback-leibler |
| 🧬 Sparse counts / binary | canberra, jaccard, sorensen |
| ⚖️ Scale-invariant | manhattan, chebyshev |

Run getDistMethods() to explore all 45+ implemented measures.
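
For probability or compositional data, measures such as jensen-shannon are defined on vectors that sum to 1, so raw counts are usually converted to proportions first. The base R sketch below uses a made-up count matrix to show why this matters: two samples with the same composition but different totals look far apart on raw counts and identical as proportions. All names and numbers here are hypothetical.

```r
# Hypothetical count matrix: rows = samples, columns = categories.
counts <- rbind(
  sample_a = c(10, 30, 60),
  sample_b = c(2, 6, 12),    # same composition as sample_a, 5x fewer counts
  sample_c = c(50, 30, 20)
)

# Convert each row to relative frequencies so that every row sums to 1.
probs <- counts / rowSums(counts)

# Plain Euclidean distance between two numeric vectors.
euclid <- function(a, b) sqrt(sum((a - b)^2))

# On raw counts the distance mostly reflects the difference in totals.
d_raw <- euclid(counts["sample_a", ], counts["sample_b", ])

# On proportions the two samples coincide.
d_prop <- euclid(probs["sample_a", ], probs["sample_b", ])

print(c(raw = d_raw, proportions = d_prop))
```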


🧮 Examples

philentropy::getDistMethods()
 [1] "euclidean"         "manhattan"         "minkowski"         "chebyshev"         "sorensen"
 [6] "gower"             "soergel"           "kulczynski_d"      "canberra"          "lorentzian"
[11] "intersection"      "non-intersection"  "wavehedges"        "czekanowski"       "motyka"
[16] "kulczynski_s"      "tanimoto"          "ruzicka"           "inner_product"     "harmonic_mean"
[21] "cosine"            "hassebrook"        "jaccard"           "dice"              "fidelity"
[26] "bhattacharyya"     "hellinger"         "matusita"          "squared_chord"     "squared_euclidean"
[31] "pearson"           "neyman"            "squared_chi"       "prob_symm"         "divergence"
[36] "clark"             "additive_symm"     "kullback-leibler"  "jeffreys"          "k_divergence"
[41] "topsoe"            "jensen-shannon"    "jensen_difference" "taneja"            "kumar-johnson"
[46] "avg"

# define probability density functions P and Q
P <- 1:10/sum(1:10)
Q <- 20:29/sum(20:29)

x <- rbind(P, Q)
philentropy::distance(x, method = "jensen-shannon")
jensen-shannon using unit 'log'.
jensen-shannon
    0.02628933
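
To see what this number represents, the Jensen-Shannon divergence can be reproduced by hand in base R. The sketch below uses the textbook definition (the average Kullback-Leibler divergence of P and Q to their midpoint M = (P + Q) / 2, with natural logarithms to match unit 'log'); the helper names kl_div and js_div are made up for this illustration and are not package functions.

```r
# Probability vectors from the example above.
P <- 1:10 / sum(1:10)
Q <- 20:29 / sum(20:29)

# Kullback-Leibler divergence with natural log; entries with a == 0 contribute 0.
kl_div <- function(a, b) {
  keep <- a > 0
  sum(a[keep] * log(a[keep] / b[keep]))
}

# Jensen-Shannon divergence: mean KL divergence of a and b to their midpoint m.
js_div <- function(a, b) {
  m <- (a + b) / 2
  0.5 * kl_div(a, m) + 0.5 * kl_div(b, m)
}

jsd_manual <- js_div(P, Q)
print(jsd_manual)
```

The printed value should agree with the result of distance() shown above.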

Alternatively, compute all available distances:

philentropy::dist.diversity(x, p = 2, unit = "log2")
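
In this call, p is assumed to set the order of the Minkowski distance and unit the logarithm base used by the log-based measures. As a rough base R illustration of the first point (not the package's code), the sketch below shows how Manhattan, Euclidean, and Chebyshev distances relate to a Minkowski distance of order p; the helper name minkowski is hypothetical.

```r
P <- 1:10 / sum(1:10)
Q <- 20:29 / sum(20:29)

# Minkowski distance of order p between two numeric vectors.
minkowski <- function(a, b, p) sum(abs(a - b)^p)^(1 / p)

manhattan <- sum(abs(P - Q))        # Minkowski with p = 1
euclidean <- sqrt(sum((P - Q)^2))   # Minkowski with p = 2
chebyshev <- max(abs(P - Q))        # limit of Minkowski as p grows large

print(c(
  manhattan    = manhattan,
  euclidean    = euclidean,
  chebyshev    = chebyshev,
  minkowski_p2 = minkowski(P, Q, 2)
))
```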

🌟 Papers using philentropy (highlights)

Flagship examples from top venues. The full list is in the appendix at the end of this document.

Nature / Cell / Science
  • A transcriptomic hourglass in brown algae
    JS Lotharukpong, M Zheng, R Luthringer et al. – Nature, 2024
  • Annelid functional genomics reveal the origins of bilaterian life cycles
    FM Martín-Zamora, Y Liang, K Guynes et al. – Nature, 2023
  • An atlas of gene regulatory elements in adult mouse cerebrum
    YE Li, S Preissl, X Hou, Z Zhang, K Zhang et al. – Nature, 2021
  • Convergent somatic mutations in metabolism genes in chronic liver disease
    S Ng, F Rouhani, S Brunner, N Brzozowska et al. – Nature, 2021
  • Antigen dominance hierarchies shape TCF1+ progenitor CD8 T cell phenotypes in tumors
    ML Burger, AM Cruz, GE Crossland et al. – Cell, 2021
  • A comparative atlas of single-cell chromatin accessibility in the human brain
    YE Li, S Preissl, M Miller, ND Johnson, Z Wang et al. – Science, 2023
Nature Methods / Nat Comms / Cell family
  • sciCSR infers B cell state transition and predicts class-switch recombination dynamics using scRNA-seq
    JCF Ng, G Montamat Garcia, AT Stewart et al. – Nature Methods, 2024
  • Decoding the gene regulatory network of endosperm differentiation in maize
    Y Yuan, Q Huo, Z Zhang, Q Wang, J Wang et al. – Nature Communications, 2024
  • Population structure in a fungal human pathogen is potentially linked to pathogenicity
    EA Hatmaker, AE Barber, MT Drott et al. – Nature Communications, 2025
  • Pan-cancer human brain metastases atlas at single-cell resolution
    X Xing, J Zhong, J Biermann, H Duan, X Zhang et al. – Cancer Cell, 2025
  • Gene module reconstruction identifies cellular differentiation processes and the regulatory logic of specialized secretion in zebrafish
    Y Wang, J Liu, LY Du, JL Wyss, JA Farrell, AF Schier – Developmental Cell, 2025
Other disciplines (selected)
  • Staphylococci in high resolution: Capturing diversity within the human nasal microbiota
    AC Ingham, DYK Ng, S Iversen, CM Liu et al. – Cell Reports, 2025
  • The power of visualizing distributional differences: formal graphical n-sample tests
    K Konstantinou, T Mrkvička, M Myllymäki – Computational Statistics, 2025
  • Plant species as ecological engineers of microtopography in a temperate sedge-grass marsh
    J Dušek, J Novotný, B Navrátilová et al. – Scientific Reports, 2025
  • Resolution of MALDI-TOF vs WGS for Bacillus identification (NASA JSC)
    F Mazhari, AB Regberg, CL Castro, MG LaMontagne – Frontiers in Microbiology, 2025
  • Every Hue Has Its Fan Club: Diverse Patterns of Color-Dependent Flower Visitation across Lepidoptera
    D Kutcherov, EL Westerman – Integrative and Comparative Biology, 2025

🎓 philentropy has been used in dozens of peer-reviewed publications to quantify distances, divergences, and similarities in complex biological and computational datasets.


🧠 Important Functions

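Distance Measures

  • distance() – Distance or similarity between the rows of a matrix for a chosen method
  • getDistMethods() – Names of all implemented distance and similarity measures
  • dist.diversity() – All available distances for a given input at once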

Information Theory

  • H() – Shannon’s Entropy H(X)
  • JE() – Joint Entropy H(X,Y)
  • CE() – Conditional Entropy H(X|Y)
  • MI() – Mutual Information I(X,Y)
  • KL() – Kullback–Leibler Divergence
  • JSD() – Jensen–Shannon Divergence
  • gJSD() – Generalized Jensen–Shannon Divergence
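
The first four quantities in this list are linked by simple identities, which the following base R sketch checks on a made-up joint distribution. It uses log base 2 (an assumption of this sketch) and a hand-written entropy helper rather than the package functions, so treat it as an illustration of the definitions only.

```r
# Hypothetical joint distribution of X (rows) and Y (columns); entries sum to 1.
joint <- rbind(
  c(0.30, 0.10),
  c(0.20, 0.40)
)

# Shannon entropy in bits; zero-probability cells contribute nothing.
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log2(p))
}

p_x <- rowSums(joint)         # marginal distribution of X
p_y <- colSums(joint)         # marginal distribution of Y

h_x  <- entropy(p_x)          # Shannon entropy H(X)
h_y  <- entropy(p_y)          # Shannon entropy H(Y)
h_xy <- entropy(joint)        # joint entropy H(X,Y)
h_x_given_y <- h_xy - h_y     # conditional entropy H(X|Y) = H(X,Y) - H(Y)
mi <- h_x + h_y - h_xy        # mutual information I(X,Y)

print(c(H_X = h_x, H_Y = h_y, H_XY = h_xy, H_X_given_Y = h_x_given_y, MI = mi))
```

Mutual information is zero exactly when X and Y are independent, that is, when the joint distribution equals the product of its marginals.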

🗞️ NEWS

Find the current status and version history in the 👉 NEWS section.


🧩 Appendix — full references

  • A transcriptomic hourglass in brown algae
    JS Lotharukpong, M Zheng, R Luthringer et al. – Nature, 2024

  • Annelid functional genomics reveal the origins of bilaterian life cycles
    FM Martín-Zamora, Y Liang, K Guynes et al. – Nature, 2023

  • An atlas of gene regulatory elements in adult mouse cerebrum
    YE Li, S Preissl, X Hou, Z Zhang, K Zhang et al. – Nature, 2021

  • Convergent somatic mutations in metabolism genes in chronic liver disease
    S Ng, F Rouhani, S Brunner, N Brzozowska et al. – Nature, 2021

  • Antigen dominance hierarchies shape TCF1+ progenitor CD8 T cell phenotypes in tumors
    ML Burger, AM Cruz, GE Crossland et al. – Cell, 2021

  • High-content single-cell combinatorial indexing
    R Mulqueen et al. – Nature Biotechnology, 2021

  • A comparative atlas of single-cell chromatin accessibility in the human brain
    YE Li, S Preissl, M Miller, ND Johnson, Z Wang et al. – Science, 2023

  • Extinction at the end-Cretaceous and the origin of modern Neotropical rainforests
    MR Carvalho, C Jaramillo et al. – Science, 2021

  • sciCSR infers B cell state transition and predicts class-switch recombination dynamics using single-cell transcriptomic data
    JCF Ng, G Montamat Garcia, AT Stewart et al. – Nature Methods, 2024

  • HERMES: a molecular-formula-oriented method to target the metabolome
    R Giné, J Capellades, JM Badia et al. – Nature Methods, 2021

  • Epithelial zonation along the mouse and human small intestine defines five discrete metabolic domains
    RK Zwick, P Kasparek, B Palikuqi et al. – Nature Cell Biology, 2024

  • The genetic architecture of temperature adaptation is shaped by population ancestry and not by selection regime
    KA Otte, V Nolte, F Mallard et al. – Genome Biology, 2021

  • The Tug1 lncRNA locus is essential for male fertility
    JP Lewandowski et al. – Genome Biology, 2020

  • Decoding the gene regulatory network of endosperm differentiation in maize
    Y Yuan, Q Huo, Z Zhang, Q Wang, J Wang et al. – Nature Communications, 2024

  • A full-body transcription factor expression atlas with completely resolved cell identities in C. elegans
    Y Li, S Chen, W Liu, D Zhao, Y Gao, S Hu, H Liu, Y Li et al. – Nature Communications, 2024

  • Comprehensive mapping and modelling of the rice regulome landscape unveils the regulatory architecture underlying complex traits
    T Zhu, C Xia, R Yu, X Zhou, X Xu, L Wang et al. – Nature Communications, 2024

  • Transcriptional vulnerabilities of striatal neurons in human and rodent models of Huntington’s disease
    A Matsushima, SS Pineda, JR Crittenden et al. – Nature Communications, 2023

  • Population structure in a fungal human pathogen is potentially linked to pathogenicity
    EA Hatmaker, AE Barber, MT Drott et al. – Nature Communications, 2025

  • Resolving the structure of phage–bacteria interactions in the context of natural diversity
    KM Kauffman, WK Chang, JM Brown et al. – Nature Communications, 2022

  • Gut microbiome-mediated metabolism effects on immunity in rural and urban African populations
    M Stražar, GS Temba, H Vlamakis et al. – Nature Communications, 2021

  • Aging, inflammation and DNA damage in the somatic testicular niche with idiopathic germ cell aplasia
    M Alfano, AS Tascini, F Pederzoli et al. – Nature Communications, 2021

  • Single cell census of human kidney organoids shows reproducibility and diminished off-target cells after transplantation
    A Subramanian et al. – Nature Communications, 2019

  • Pan-cancer human brain metastases atlas at single-cell resolution
    X Xing, J Zhong, J Biermann, H Duan, X Zhang et al. – Cancer Cell, 2025

  • The temporal progression of lung immune remodeling during breast cancer metastasis
    CS McGinnis, Z Miao, D Superville, W Yao et al. – Cancer Cell, 2024

  • Cross-tissue human fibroblast atlas reveals myofibroblast subtypes with distinct roles in immune modulation
    Y Gao, J Li, W Cheng, T Diao, H Liu, Y Bo, C Liu et al. – Cancer Cell, 2024

  • Gene module reconstruction identifies cellular differentiation processes and the regulatory logic of specialized secretion in zebrafish
    Y Wang, J Liu, LY Du, JL Wyss, JA Farrell, AF Schier – Developmental Cell, 2025

  • Large-scale chromatin reorganization reactivates placenta-specific genes that drive cellular aging
    Z Liu, Q Ji, J Ren, P Yan, Z Wu, S Wang, L Sun, Z Wang et al. – Developmental Cell, 2022

  • Integrated single-cell and spatial transcriptomic profiling reveals that CD177+ Tregs enhance immunosuppression through apoptosis and resistance to …
    Y Liang, L Qiao, Q Qian, R Zhang, Y Li, X Xu et al. – Oncogene, 2025

  • Conserved and unique features of terminal telomeric sequences in ALT-positive cancer cells
    B Azeroglu, W Wu, R Pavani, RS Sandhu et al. – eLife, 2025

  • Spotless, a reproducible pipeline for benchmarking cell type deconvolution in spatial transcriptomics
    C Sang-Aram, R Browaeys, R Seurinck, Y Saeys – eLife, 2024

  • Loss of adaptive capacity in asthmatic patients revealed by biomarker fluctuation dynamics after rhinovirus challenge
    A Sinha et al. – eLife, 2019

  • Staphylococci in high resolution: Capturing diversity within the human nasal microbiota
    AC Ingham, DYK Ng, S Iversen, CM Liu et al. – Cell Reports, 2025

  • Triple network dynamics and future alcohol consumption in adolescents
    CC McIntyre, M Khodaei, RG Lyday et al. – Alcohol: Clinical and Experimental Research, 2025

  • Benchmarking 13 tools for mutational signature attribution, including a new and improved algorithm
    N Jiang, Y Wu, SG Rozen – Briefings in Bioinformatics, 2025

  • Plant species as ecological engineers of microtopography in a temperate sedge-grass marsh
    J Dušek, J Novotný, B Navrátilová et al. – Scientific Reports, 2025

  • Resolution of MALDI-TOF compared to whole genome sequencing for identification of Bacillus species isolated from cleanrooms at NASA Johnson Space Center
    F Mazhari, AB Regberg, CL Castro, MG LaMontagne – Frontiers in Microbiology, 2025

  • Every Hue Has Its Fan Club: Diverse Patterns of Color-Dependent Flower Visitation across Lepidoptera
    D Kutcherov, EL Westerman – Integrative and Comparative Biology, 2025

  • An in vivo CRISPR screen in chick embryos reveals a role for MLLT3 in specification of neural cells from the caudal epiblast
    ARG Libby, T Rito, A Radley, J Briscoe – Development, 2025

  • Single-Cell Analyses Reveal a Functionally Heterogeneous Exhausted CD8+ T-cell Subpopulation That Is Correlated with Response to Checkpoint Therapy in …
    KM Mahuron, O Shahid, P Sao, C Wu et al. – Cancer Research, 2025

  • ETS1‐Driven Nucleolar Stress Orchestrates OLR1+ Macrophage Crosstalk to Sustain Immunosuppressive Microenvironment in Clear Cell Renal Cell Carcinoma
    L Xiao, Z Zhang, T Li, Y Jiang, Y Liu, J Wang, W Tang – Human Mutation, 2025

  • Association Between Ocular Microbiomes of Children and Their Siblings and Parents
    X Ling, Y Zhang, CHT Bui, HN Chan, POS Tam et al. – Investigative Ophthalmology & Visual Science, 2025

  • Benefits and challenges of host depletion methods in profiling the upper and lower respiratory microbiome
    C Wang, L Zhang, C Kan, J He, W Liang, R Xia et al. – Biofilms and Microbiomes, 2025

  • Unsettled Times: Music Discovery Reveals Divergent Cultural Responses to War
    H Lee, M Anglada-Tort, O Sobchuk et al. – PsyArXiv Preprints, 2025

  • q-Generalization of Nakagami distribution with applications
    N Kumar, A Dixit, V Vijay – Japanese Journal of Statistics and Data Science, 2025

  • The power of visualizing distributional differences: formal graphical n-sample tests
    K Konstantinou, T Mrkvička, M Myllymäki – Computational Statistics, 2025

  • Topic Modeling of Positive and Negative Reviews of Soulslike Video Games
    T Guzsvinecz – Computers, 2025

  • Basic Statistical Inference
    M Nguyen – Foundations of Data Analysis, 2025