This function takes four weighted adjacency matrices as input. All four must have been generated with the same network inference tool (e.g. GENIE3), but from input data that was pre-processed differently with respect to noise-filtering and quantile-normalization. The four matrices are assumed to come from network inference runs with the following input data specifications:

  • Raw mapped RNAseq count data without noise-filtering or quantile-normalization applied (adj_mat_not_filtered_not_normalized)

  • Raw mapped RNAseq count data with noise-filtering applied but no quantile-normalization (adj_mat_filtered_and_not_normalized)

  • Raw mapped RNAseq count data without noise-filtering but with quantile-normalization applied (adj_mat_not_filtered_but_normalized)

  • Raw mapped RNAseq count data with noise-filtering applied and with quantile-normalization applied (adj_mat_filtered_and_normalized)

All relevant pairwise comparisons are then performed internally with network_dist_pairwise_genes based on gene-wise Hamming Distance or Jaccard Similarity Coefficient computations.
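
As a rough illustration of what pairwise comparison means for four inputs, the following base R sketch enumerates the unordered pairs. The short labels (F for noise-filtering, N for quantile-normalization) mirror the column names shown in the example output further down this page. This is not the package's own code, and the order in which network_benchmark_noise_filtering runs and reports the comparisons may differ.

```r
# Hypothetical short labels for the four inputs
# (F = noise-filtering, N = quantile-normalization).
conditions <- c("-F -N", "+F -N", "-F +N", "+F +N")

# combn() returns a 2 x choose(4, 2) character matrix,
# one column per unordered pair of conditions.
pairs <- combn(conditions, 2)

# Build one label per comparison, e.g. "-F -N / +F -N".
comparison_labels <- apply(pairs, 2, paste, collapse = " / ")
print(comparison_labels)
```

Four inputs give choose(4, 2) = 6 unordered pairs, which matches the six comparison columns in the result tibble.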

network_benchmark_noise_filtering(
  adj_mat_not_filtered_not_normalized,
  adj_mat_filtered_and_not_normalized,
  adj_mat_not_filtered_but_normalized,
  adj_mat_filtered_and_normalized,
  grn_tool = NA,
  threshold = "median",
  dist_type = "hamming",
  print_message = TRUE
)

Arguments

adj_mat_not_filtered_not_normalized

a weighted adjacency matrix derived from a network inference program where no noise-filtering or quantile-normalization was applied to the input data.

adj_mat_filtered_and_not_normalized

a weighted adjacency matrix derived from a network inference program where noise-filtering but no quantile-normalization was applied to the input data.

adj_mat_not_filtered_but_normalized

a weighted adjacency matrix derived from a network inference program where no noise-filtering but quantile-normalization was applied to the input data.

adj_mat_filtered_and_normalized

a weighted adjacency matrix derived from a network inference program where noise-filtering and quantile-normalization were applied to the input data.

grn_tool

a character string specifying the gene regulatory network inference tool that was used to generate input matrices. Default is grn_tool = NA.

threshold

we recommend using network_rescale before using this function. Re-scaling transforms all values into the range [0,100]. The threshold can either be a numeric value in the interval [0,100] or a character string specifying one of the following methods for automatically determining the threshold from the input data:

  • threshold = "median": compute the median over the entire input adj_mat and use it as the threshold. All edge weights of a gene that are equal to or below the median are set to 0, and all values above the median are set to 1.

See network_make_binary for details.
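
The re-scaling and median thresholding described above can be sketched in base R as follows. This is a minimal illustration on made-up values. The min-max formula is an assumed stand-in for network_rescale, and the thresholding line only mimics the rule stated for threshold = "median"; neither is taken from the package source.

```r
# Toy weighted adjacency matrix; the values are made up for illustration.
genes <- c("g1", "g2", "g3")
adj <- matrix(c(0.0, 0.2, 0.8,
                0.2, 0.0, 0.5,
                0.8, 0.5, 0.0),
              nrow = 3, byrow = TRUE,
              dimnames = list(genes, genes))

# Min-max re-scaling into [0, 100] (an assumption, not network_rescale() itself).
rescaled <- (adj - min(adj)) / (max(adj) - min(adj)) * 100

# Median over all entries of the matrix, used as the cut-off.
cutoff <- median(rescaled)

# Values equal to or below the cut-off become 0, values above become 1.
binary <- (rescaled > cutoff) * 1

print(cutoff)
print(binary)
```

Because entries equal to the median are mapped to 0, the strict comparison `>` is used rather than `>=`.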

dist_type

the distance method that is applied to the binary values of each gene. Available options are:

  • dist_type = "hamming": computes the Hamming Distance (hamming.distance) for each gene between the two input matrices

  • dist_type = "jaccard": computes the Jaccard Similarity Coefficient (jaccard) for each gene between the two input matrices

See network_dist_pairwise_genes for details.
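
To make the two measures concrete, here is a small base R sketch that computes a Hamming distance and a Jaccard similarity per gene for two toy binary matrices. It assumes that each gene corresponds to one row and that both matrices list the same genes in the same order. The textbook definitions are used here, so details such as the handling of genes without any edges may differ from what network_dist_pairwise_genes does.

```r
# Two toy binary adjacency matrices with the same genes in the same order.
genes <- c("g1", "g2", "g3")
a <- matrix(c(0, 1, 1,
              1, 0, 0,
              1, 0, 0),
            nrow = 3, byrow = TRUE, dimnames = list(genes, genes))
b <- matrix(c(0, 1, 0,
              1, 0, 1,
              0, 1, 0),
            nrow = 3, byrow = TRUE, dimnames = list(genes, genes))

# Hamming distance per gene: number of positions where the two rows differ.
hamming_per_gene <- rowSums(a != b)

# Jaccard similarity per gene: edges shared by both rows divided by
# edges present in at least one of the rows.
# A gene with no edges in either matrix yields 0 / 0 = NaN in this sketch.
shared <- rowSums(a == 1 & b == 1)
either <- rowSums(a == 1 | b == 1)
jaccard_per_gene <- shared / either

print(hamming_per_gene)
print(jaccard_per_gene)
```

Note that the two measures point in opposite directions: a larger Hamming distance means the two networks disagree more for that gene, whereas a larger Jaccard coefficient means they agree more.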

print_message

a logical value indicating whether messages should be printed. Default is print_message = TRUE.

Author

Hajk-Georg Drost

Examples

# Benchmark GENIE3 inferred networks with raw, no_noise, and quantile_norm combinations
genie3_49_raw <- as.matrix(read.csv(
  system.file("data/network_raw_49_placenta_development.csv", package = "edgynode"),
  row.names = 1))
genie3_49_noNoiseCM_raw <- as.matrix(read.csv(
  system.file("data/network_noNoiseCM_raw_49_placenta_development.csv", package = "edgynode"),
  row.names = 1))
genie3_49_qnorm_no_noise_removed <- as.matrix(read.csv(
  system.file("data/network_qnorm_49_placenta_development.csv", package = "edgynode"),
  row.names = 1))
genie3_49_noNoiseCM_qnorm <- as.matrix(read.csv(
  system.file("data/network_noNoiseCM_qnorm_49_placenta_development.csv", package = "edgynode"),
  row.names = 1))

# Run Benchmark using Hamming distance
benchmark_hamming <- network_benchmark_noise_filtering(
  genie3_49_raw,
  genie3_49_noNoiseCM_raw,
  genie3_49_qnorm_no_noise_removed,
  genie3_49_noNoiseCM_qnorm,
  dist_type = "hamming",
  grn_tool = "GENIE3")
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [11.27] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [9.95] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [10.75] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [11.04] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#>
#> Comparison 1: 'adj_mat_not_filtered_not_normalized' vs 'adj_mat_filtered_and_not_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Hamming Distances for each gene between the two input matrices are computed.
#>
#> Comparison 2: 'adj_mat_not_filtered_not_normalized' vs 'adj_mat_not_filtered_but_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Hamming Distances for each gene between the two input matrices are computed.
#>
#> Comparison 3: 'adj_mat_not_filtered_not_normalized' vs 'adj_mat_filtered_and_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Hamming Distances for each gene between the two input matrices are computed.
#>
#> Comparison 4: 'adj_mat_not_filtered_but_normalized' vs 'adj_mat_filtered_and_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Hamming Distances for each gene between the two input matrices are computed.
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Hamming Distances for each gene between the two input matrices are computed.
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Hamming Distances for each gene between the two input matrices are computed.

# look at results
benchmark_hamming
#> # A tibble: 46 x 8
#>    grn_tool genes `-F -N / +F -N` `-F -N / -F +N` `-F -N / +F +N`
#>    <chr>    <chr>           <dbl>           <dbl>           <dbl>
#>  1 GENIE3   ENSM…               2               4               7
#>  2 GENIE3   ENSM…              10              11              14
#>  3 GENIE3   ENSM…               1               6               9
#>  4 GENIE3   ENSM…               3               4               6
#>  5 GENIE3   ENSM…               2               6               7
#>  6 GENIE3   ENSM…               2               5               6
#>  7 GENIE3   ENSM…               7              10              14
#>  8 GENIE3   ENSM…               1               8               9
#>  9 GENIE3   ENSM…               4               5               4
#> 10 GENIE3   ENSM…               4               1               1
#> # … with 36 more rows, and 3 more variables: `-F +N / +F +N` <dbl>, `-F +N / +F
#> #   -N` <dbl>, `+F +N / +F -N` <dbl>

# Run Benchmark using Jaccard Coefficients
benchmark_jaccard <- network_benchmark_noise_filtering(
  genie3_49_raw,
  genie3_49_noNoiseCM_raw,
  genie3_49_qnorm_no_noise_removed,
  genie3_49_noNoiseCM_qnorm,
  dist_type = "jaccard",
  grn_tool = "GENIE3")
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [11.27] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [9.95] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [10.75] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#> Warning: The matrix provided as input for network_rescale() was coerced into symmetric.
#> network_make_binary() applies the median value over all values in the input matrix an uses [11.04] as cut-off threshold to transform the input weighted adjacency matrix into a binary adjacency matrix.
#>
#> Comparison 1: 'adj_mat_not_filtered_not_normalized' vs 'adj_mat_filtered_and_not_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Jaccard Similarity Coefficients for each gene between the two input matrices are computed.
#>
#> Comparison 2: 'adj_mat_not_filtered_not_normalized' vs 'adj_mat_not_filtered_but_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Jaccard Similarity Coefficients for each gene between the two input matrices are computed.
#>
#> Comparison 3: 'adj_mat_not_filtered_not_normalized' vs 'adj_mat_filtered_and_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Jaccard Similarity Coefficients for each gene between the two input matrices are computed.
#>
#> Comparison 4: 'adj_mat_not_filtered_but_normalized' vs 'adj_mat_filtered_and_normalized'
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Jaccard Similarity Coefficients for each gene between the two input matrices are computed.
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Jaccard Similarity Coefficients for each gene between the two input matrices are computed.
#> - adj_mat_qry: nrow = (46) and ncol(46)
#> - adj_mat_sbj: nrow = (46) and ncol(46)
#> The Jaccard Similarity Coefficients for each gene between the two input matrices are computed.

# look at results
benchmark_jaccard
#> # A tibble: 46 x 8
#>    grn_tool genes `-F -N / +F -N` `-F -N / -F +N` `-F -N / +F +N`
#>    <chr>    <chr>           <dbl>           <dbl>           <dbl>
#>  1 GENIE3   ENSM…           0.92            0.84            0.741
#>  2 GENIE3   ENSM…           0.643           0.577           0.5
#>  3 GENIE3   ENSM…           0.955           0.76            0.625
#>  4 GENIE3   ENSM…           0.842           0.810           0.714
#>  5 GENIE3   ENSM…           0.905           0.75            0.72
#>  6 GENIE3   ENSM…           0.929           0.833           0.793
#>  7 GENIE3   ENSM…           0.75            0.630           0.481
#>  8 GENIE3   ENSM…           0.96            0.704           0.667
#>  9 GENIE3   ENSM…           0.846           0.808           0.852
#> 10 GENIE3   ENSM…           0.818           0.947           0.947
#> # … with 36 more rows, and 3 more variables: `-F +N / +F +N` <dbl>, `-F +N / +F
#> #   -N` <dbl>, `+F +N / +F -N` <dbl>