
Performs a permutation test in which both the observed and the permuted data select their own best sigma. This addresses the inflated Type I error that arises when sigma selection is applied only to the observed data.

Usage

runSkrCCAPermu_FairSigma(
  object,
  nPermu = 100,
  sigma_values = NULL,
  permu_method = "bin",
  permu_which = "second_only",
  num_bins_x = 10,
  num_bins_y = 10,
  match_quantile = FALSE,
  maxIter = 200,
  tol = 1e-05,
  n_cores = 1,
  verbose = TRUE
)

Arguments

object

A CoPro object with CCA already computed via runSkrCCA() and normalized correlation computed via computeNormalizedCorrelation()

nPermu

Number of permutations to run (default: 100)

sigma_values

Vector of sigma values to test. If NULL, uses all sigma values from the original analysis (object@sigmaValues)

permu_method

Method of permutation: "bin", "global", "pc", or "toroidal"

permu_which

Which cell types to permute: "second_only", "both", "first_only"

num_bins_x

Number of bins in x for bin-wise permutation

num_bins_y

Number of bins in y for bin-wise permutation

match_quantile

Whether to use quantile matching for bin permutation

maxIter

Maximum iterations for CCA optimization

tol

Convergence tolerance

n_cores

Number of cores for parallel computation (not yet implemented)

verbose

Whether to print progress messages

Value

CoPro object with fair permutation results stored in:

  • @skrCCAPermuOut: Best weights for each permutation

  • @normalizedCorrelationPermu: Best ncorr for each permutation

  • @fairSigmaPermu: List with sigma selected for each permutation

Details

The Sigma Selection Problem

In standard CoPro analysis, the best sigma (the one maximizing normalized correlation) is chosen on the observed data. The permuted data are then evaluated at this SAME sigma, which may not be the best sigma for them. The observed statistic is therefore a maximum over sigma values while each permuted statistic is not, and this asymmetry can inflate Type I error.
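The size of this effect can be illustrated with a small simulation. The sketch below does not use CoPro at all: it assumes, purely for illustration, that under the null the statistic at each sigma is independent standard normal noise, and it uses a common permutation p-value formula, (1 + number of permuted statistics >= observed) / (nPermu + 1). In real data the statistics at neighbouring sigma values are correlated, so the inflation would likely be smaller than in this toy setting.

```r
set.seed(1)
n_sim   <- 2000  # number of simulated null data sets
n_sigma <- 5     # number of candidate sigma values (illustrative)
n_permu <- 99    # permutations per data set (illustrative)

reject <- logical(n_sim)
for (i in seq_len(n_sim)) {
  # Assumption: under the null, the statistic at every sigma is pure noise.
  obs_stats <- rnorm(n_sigma)
  obs_best  <- max(obs_stats)  # observed data picks its best sigma

  # Unfair null: each permutation is evaluated at one fixed sigma only,
  # so it contributes a single noise draw rather than a maximum.
  perm_fixed <- rnorm(n_permu)

  p_unfair  <- (1 + sum(perm_fixed >= obs_best)) / (n_permu + 1)
  reject[i] <- p_unfair <= 0.05
}

# Fraction of null data sets declared significant at the 0.05 level.
type1_unfair <- mean(reject)
print(type1_unfair)
```

Under these assumptions the rejection rate should come out well above the nominal 0.05, because a maximum of several noise draws is being compared against single draws.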

The Solution

This function runs CCA at EACH sigma value for EACH permutation, then selects the best sigma for that permutation. Both observed and permuted data thus have equal opportunity to optimize sigma selection.
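The loop structure described above can be sketched in a few lines of base R. This is a toy illustration, not the package's implementation: `toy_ncorr()` is a hypothetical stand-in for "run CCA at one sigma and return the normalized correlation", the permutation is a plain global shuffle, and the p-value formula is an assumption rather than what `calculate_pvalue()` necessarily computes.

```r
# Hypothetical statistic: correlation between y and a Gaussian-kernel
# smoothed version of x, where sigma is the kernel bandwidth.
toy_ncorr <- function(x, y, pos, sigma) {
  K <- exp(-outer(pos, pos, "-")^2 / (2 * sigma^2))  # kernel weights
  abs(cor(as.vector(K %*% x), y))
}

# Evaluate the statistic at every sigma and keep the best one.
best_over_sigma <- function(x, y, pos, sigma_values) {
  stats <- vapply(sigma_values,
                  function(s) toy_ncorr(x, y, pos, s),
                  numeric(1))
  list(stat = max(stats), sigma = sigma_values[which.max(stats)])
}

set.seed(42)
n            <- 60
pos          <- runif(n)          # toy 1-D positions
x            <- rnorm(n)
y            <- rnorm(n)
sigma_values <- c(0.05, 0.1, 0.2) # illustrative candidates
n_permu      <- 50

# Observed data: best statistic over all sigma values.
observed <- best_over_sigma(x, y, pos, sigma_values)

# Each permutation gets the same opportunity to pick its own best sigma.
permuted <- lapply(seq_len(n_permu), function(b) {
  best_over_sigma(x, sample(y), pos, sigma_values)
})
perm_stats  <- vapply(permuted, function(r) r$stat,  numeric(1))
perm_sigmas <- vapply(permuted, function(r) r$sigma, numeric(1))

# Assumed p-value formula: compare max-over-sigma against max-over-sigma.
p_value <- (1 + sum(perm_stats >= observed$stat)) / (n_permu + 1)
```

Here `perm_stats` and `perm_sigmas` play roughly the roles that the Value section assigns to @normalizedCorrelationPermu and @fairSigmaPermu: one best statistic and one selected sigma per permutation.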

Computational Cost

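This approach is more computationally expensive: it needs nPermu * nSigma CCA runs, where nSigma is the number of sigma values tested, instead of nPermu runs. In return, the p-values account for sigma selection in both the observed and the permuted data.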
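As a quick worked example of the run count, the snippet below uses the default nPermu of 100 and a hypothetical set of four sigma values. The variable names are illustrative and not part of the package.

```r
n_permu <- 100  # default nPermu
n_sigma <- 4    # hypothetical number of sigma values tested

fixed_sigma_runs <- n_permu            # one CCA run per permutation
fair_sigma_runs  <- n_permu * n_sigma  # one CCA run per permutation per sigma

cost_ratio <- fair_sigma_runs / fixed_sigma_runs
```

The cost grows linearly with the number of sigma values, so passing a shorter `sigma_values` vector is the most direct way to reduce run time.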

Examples

if (FALSE) { # \dontrun{
# After running standard CoPro analysis
br <- runSkrCCA(br, scalePCs = TRUE)
br <- computeNormalizedCorrelation(br)

# Run fair sigma permutation test
br <- runSkrCCAPermu_FairSigma(br, nPermu = 100,
                                permu_method = "toroidal")

# Calculate p-value
result <- calculate_pvalue(br)
} # }