netflow.methods.stats

Functions

compute_pearson(row_x, row_y)

Compute Pearson correlation with each row in Y for a given row in X.

compute_pearson_parallel(X, Y[, ...])

Compute Pearson correlation between each row of X and all rows of Y in parallel.

compute_spearman(row_x, row_y)

Compute Spearman correlation with each row in Y for a given row in X.

compute_spearman_parallel(X, Y[, ...])

Compute Spearman correlation between each row of X and all rows of Y in parallel.

mann_whitney_u_test(values1, values2[, ...])

Perform the Mann-Whitney U rank test on two independent samples.

perform_stat_test(values1, values2, ...)

Choose and perform a statistical test according to test_type.

perform_stat_test_matrix(X1, X2, *[, test, ...])

Compute per-feature p-values comparing two groups of samples.

sig_feats_by_group(groups, feats, *[, test, ...])

Compute per-group feature significance vs the rest of the cohort.

stat_test(df1, df2[, test, alpha, method, ...])

Perform statistical test between groups/datasets and apply multiple test correction.

t_test(values1, values2[, alternative, ...])

Calculate the T-test for the means of two independent samples of scores.

wilcoxon_signed_rank_test(values1[, ...])

The Wilcoxon signed-rank test.

netflow.methods.stats._apply_multipletests_safe(
pvals: ndarray,
*,
alpha: float,
method: str,
) → ndarray

Apply multiple testing correction with stable behavior in presence of NaNs.

Parameters:
  • pvals (numpy.ndarray, shape (n_features,)) – Raw p-values. May include NaNs (e.g., insufficient samples, constant vectors).

  • alpha (float) – Family-wise error rate / FDR control level passed to statsmodels.

  • method (str) – Multipletests method name. See _validate_multitest_method.

Returns:

qvals – Corrected p-values. NaN p-values remain NaN in qvals.

Return type:

numpy.ndarray, shape (n_features,)

Notes

statsmodels.stats.multitest.multipletests does not always behave well with NaNs. This helper replaces NaNs with 1.0 for correction, then restores NaNs.
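
The replace-then-restore pattern described in the Notes can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's code: nan_safe_correction and bh_adjust are hypothetical names, and a hand-written Benjamini/Hochberg adjustment stands in for statsmodels.stats.multitest.multipletests so that the sketch needs only NumPy.

```python
import numpy as np

def bh_adjust(pvals):
    """Benjamini/Hochberg adjusted p-values (hand-rolled stand-in for statsmodels)."""
    p = np.asarray(pvals, dtype=float)
    n = p.size
    order = np.argsort(p)
    scaled = p[order] * n / np.arange(1, n + 1)
    # Enforce monotonicity, working down from the largest p-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]
    out = np.empty(n)
    out[order] = np.clip(scaled, 0.0, 1.0)
    return out

def nan_safe_correction(pvals):
    """Hypothetical helper: correct p-values while keeping NaNs as NaNs."""
    p = np.asarray(pvals, dtype=float)
    nan_mask = np.isnan(p)
    filled = np.where(nan_mask, 1.0, p)  # NaN -> 1.0 for the correction step
    q = bh_adjust(filled)
    q[nan_mask] = np.nan                 # restore NaNs afterwards
    return q

qvals = nan_safe_correction([0.01, np.nan, 0.04, 0.03])
```

Because the NaN entries are filled with 1.0 rather than dropped, they still count toward the number of tests in this sketch.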

netflow.methods.stats._coerce_samples_x_features(
groups: Series,
feats: DataFrame,
*,
samples_axis: Literal['auto', 'index', 'columns'] = 'auto',
) → Tuple[Series, DataFrame]

Coerce and align inputs so feature data feats is shaped as (n_samples, n_features).

This helper ensures all downstream computations can assume:

  • X_df is a DataFrame with samples as rows and features as columns

  • X_df.index matches groups.index exactly (same sample IDs, same order)

It supports two common input conventions:

  1. feats provided as samples x features (samples on index)

  2. feats provided as features x samples (samples on columns), in which case it will be transposed.

Parameters:
  • groups (pandas.Series) – Group labels indexed by sample ID. The sample IDs are used to align groups and feats.

  • feats (pandas.DataFrame) –

    Feature matrix in one of two orientations:

    • samples x features (feats.index are sample IDs, feats.columns feature names)

    • features x samples (feats.columns are sample IDs, feats.index feature names)

  • samples_axis ({"auto", "index", "columns"}, default="auto") –

    Controls how alignment/orientation is determined.

    • “auto”: infer orientation from whether groups.index matches feats.index or feats.columns.

    • “index”: enforce that samples are stored on feats.index.

    • “columns”: enforce that samples are stored on feats.columns.

Returns:

  • groups_aligned (pandas.Series) – groups reindexed to match the sample order of the returned feature matrix.

  • X_df (pandas.DataFrame) – Feature matrix oriented as samples x features, with:

    • X_df.index = sample IDs (same order as groups_aligned.index)

    • X_df.columns = feature names

Raises:
  • TypeError – If groups is not a Series or feats is not a DataFrame.

  • ValueError – If the function cannot determine a consistent alignment between groups and feats under the given samples_axis.

Notes

This function is intentionally strict: it does not silently drop samples. If indices do not align, it raises with a clear message rather than producing a subtly misaligned analysis.
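
A rough sketch of the orientation inference and strict alignment described above. The function name coerce_samples_x_features, the set-based label comparison, and the error messages are assumptions made for illustration; the actual helper may compare labels differently.

```python
import pandas as pd

def coerce_samples_x_features(groups, feats, samples_axis="auto"):
    """Hypothetical sketch: return (groups_aligned, X_df) with samples on rows."""
    ids = set(groups.index)
    if samples_axis == "auto":
        on_index = ids == set(feats.index)
        on_columns = ids == set(feats.columns)
        if on_index == on_columns:  # neither axis matches, or both do
            raise ValueError("cannot infer which axis holds the samples")
        samples_axis = "index" if on_index else "columns"
    # Transpose only when samples are stored on the columns.
    X_df = feats if samples_axis == "index" else feats.T
    if len(X_df.index) != len(groups.index) or set(X_df.index) != ids:
        raise ValueError("sample IDs of groups and feats do not match")
    # Reorder groups to follow the sample order of the feature matrix.
    return groups.loc[X_df.index], X_df

groups = pd.Series(["a", "b", "a"], index=["s3", "s1", "s2"])
feats = pd.DataFrame(
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    index=["f1", "f2"],
    columns=["s1", "s2", "s3"],  # features x samples
)
groups_aligned, X_df = coerce_samples_x_features(groups, feats)
```

Nothing is dropped in this sketch: a mismatch in sample IDs raises instead of being resolved by an inner join.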

netflow.methods.stats._coerce_two_dfs_samples_x_features(
df1: DataFrame,
df2: DataFrame,
*,
samples_axis: Literal['auto', 'index', 'columns'] = 'auto',
test: Literal['MWU', 't-test', 'wilcoxon'] = 'MWU',
) → Tuple[DataFrame, DataFrame]

Coerce and strictly align two DataFrames so both are shaped as (n_samples, n_features).

Parameters:
  • df1 (pandas.DataFrame) – First matrix to compare. May be either samples x features (samples on index) or features x samples (samples on columns).

  • df2 (pandas.DataFrame) – Second matrix to compare. May be either samples x features (samples on index) or features x samples (samples on columns).

  • samples_axis ({"auto","index","columns"}, default="auto") –

    • “index”: enforce samples on index (samples x features)

    • “columns”: enforce samples on columns (features x samples), then transpose

    • “auto”: infer strictly from label agreement between df1 and df2:
      • if df1.columns == df2.columns and df1.index != df2.index -> samples_axis=“index”

      • if df1.index == df2.index and df1.columns != df2.columns -> samples_axis=“columns”

      • if both index and columns match -> ambiguous; choose by heuristic (smaller dimension treated as samples); if tie, raise.

      • if neither matches -> raise.

  • test ({"MWU","t-test","wilcoxon"}, default="MWU") – If “wilcoxon”, requires paired samples after coercion (same sample index).

Returns:

X1, X2 – Coerced matrices as samples x features, strictly aligned on features. If test=“wilcoxon”, also strictly aligned on samples.

Return type:

tuple of (pandas.DataFrame, pandas.DataFrame)

Raises:

ValueError – If strict alignment is not possible under the requested/inferred orientation.
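
The “auto” decision rules listed for samples_axis can be sketched in plain Python. The function name infer_samples_axis is hypothetical, and treating "==" as exact, order-sensitive label equality is an assumption; the real helper may be more lenient about ordering.

```python
def infer_samples_axis(index1, columns1, index2, columns2):
    """Hypothetical sketch of the "auto" orientation rules for two matrices."""
    same_index = list(index1) == list(index2)
    same_columns = list(columns1) == list(columns2)
    if same_columns and not same_index:
        return "index"    # features shared on columns, so samples are on the index
    if same_index and not same_columns:
        return "columns"  # features shared on the index, so samples are on columns
    if same_index and same_columns:
        n_rows, n_cols = len(index1), len(columns1)
        if n_rows == n_cols:
            raise ValueError("ambiguous orientation: identical labels on a square matrix")
        # Heuristic from the docs: the smaller dimension is treated as samples.
        return "index" if n_rows < n_cols else "columns"
    raise ValueError("neither index nor columns agree between df1 and df2")

axis_a = infer_samples_axis(["a1", "a2"], ["f1", "f2", "f3"],
                            ["b1", "b2", "b3"], ["f1", "f2", "f3"])
axis_b = infer_samples_axis(["f1", "f2"], ["a1", "a2", "a3"],
                            ["f1", "f2"], ["b1", "b2"])
axis_c = infer_samples_axis(["s1", "s2"], ["f1", "f2", "f3", "f4", "f5"],
                            ["s1", "s2"], ["f1", "f2", "f3", "f4", "f5"])
```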

netflow.methods.stats._feature_chunks(
n_features: int,
chunk_size: int,
) → List[Tuple[int, int]]

Partition a number of features into contiguous chunks (half-open intervals).

Parameters:
  • n_features (int) – Number of features.

  • chunk_size (int) – Chunk size (number of features per chunk). Must be positive.

Returns:

chunks – List of (start, end) chunk intervals covering [0, n_features).

Return type:

list of tuple(int, int)

Raises:

ValueError – If chunk_size <= 0.
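
A minimal sketch of partitioning into half-open intervals as described above. The name feature_chunks is illustrative and the body is one plausible way to produce such chunks, not necessarily the library's.

```python
from typing import List, Tuple

def feature_chunks(n_features: int, chunk_size: int) -> List[Tuple[int, int]]:
    """Hypothetical sketch: split [0, n_features) into contiguous (start, end) chunks."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, n_features))  # the last chunk may be shorter
        for start in range(0, n_features, chunk_size)
    ]

chunks = feature_chunks(10, 4)
```

Each end equals the next start, so the chunks cover every feature exactly once.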

netflow.methods.stats._mwu_mask_chunk(
task: Tuple[int, int, ndarray, ndarray],
) → Tuple[int, ndarray]

Compute MWU p-values for a chunk of feature columns given group/rest row indices.

Parameters:

task (tuple) –

(start, end, g_idx, r_idx) where:

  • start/end define feature interval [start, end)

  • g_idx are sample indices in group

  • r_idx are sample indices in rest

Returns:

  • start (int) – Start feature index.

  • pvals_chunk (numpy.ndarray) – MWU p-values for features in [start, end).

Notes

This avoids materializing (n_group x n_features) and (n_rest x n_features) matrices. It only gathers the 1D vectors needed per feature.

netflow.methods.stats._mwu_mask_init_worker(
X: ndarray,
*,
alternative: str,
nan_policy: str,
mwu_kwargs: Dict[str, Any],
) → None

Initializer for MWU group-vs-rest workers.

Parameters:
  • X (numpy.ndarray, shape (n_samples, n_features)) – Full matrix.

  • alternative ({"two-sided","less","greater"}) – Alternative hypothesis passed to SciPy.

  • nan_policy ({"omit","raise"}) – NaN handling.

  • mwu_kwargs (dict) – Extra keyword args forwarded to scipy.stats.mannwhitneyu.

Return type:

None

netflow.methods.stats._mwu_pvals_group_vs_rest(
X: ndarray,
g_mask: ndarray,
*,
alternative: Literal['two-sided', 'less', 'greater'] = 'two-sided',
nan_policy: Literal['omit', 'raise'] = 'omit',
n_jobs: int = 1,
parallel_backend: Literal['processes', 'threads', 'none'] = 'processes',
chunk_size_features: int = 256,
mwu_kwargs: Dict[str, Any] | None = None,
) → ndarray

Compute MWU p-values for group vs rest without materializing submatrices.

Parameters:
  • X (numpy.ndarray, shape (n_samples, n_features)) – Full matrix (samples x features).

  • g_mask (numpy.ndarray, dtype=bool, shape (n_samples,)) – Boolean mask selecting group samples.

  • alternative ({"two-sided","less","greater"}, default="two-sided") – Alternative hypothesis for MWU.

  • nan_policy ({"omit","raise"}, default="omit") – NaN handling.

  • n_jobs (int, default=1) – Parallel workers for MWU feature loop.

  • parallel_backend ({"processes","threads","none"}, default="processes") – Backend for parallel execution.

  • chunk_size_features (int, default=256) – Feature chunk size.

  • mwu_kwargs (dict, optional) – Extra keyword args forwarded to scipy.stats.mannwhitneyu.

Returns:

pvals – MWU p-values per feature.

Return type:

numpy.ndarray, shape (n_features,)

Notes

MWU is not vectorized across features in SciPy, so we compute it feature-wise. This implementation is memory efficient because it does not allocate X_group/X_rest.
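
The feature-wise loop can be sketched serially as below. The name mwu_group_vs_rest is hypothetical, the manual NaN filtering is one way to realize nan_policy="omit", and parallel chunking is left out; only scipy.stats.mannwhitneyu is a real library call.

```python
import numpy as np
from scipy.stats import mannwhitneyu

def mwu_group_vs_rest(X, g_mask, alternative="two-sided"):
    """Hypothetical serial sketch: one MWU p-value per feature column."""
    g_idx = np.flatnonzero(g_mask)   # row indices of the group
    r_idx = np.flatnonzero(~g_mask)  # row indices of the rest
    pvals = np.full(X.shape[1], np.nan)
    for j in range(X.shape[1]):
        col = X[:, j]
        a, b = col[g_idx], col[r_idx]            # gather 1D vectors only
        a, b = a[~np.isnan(a)], b[~np.isnan(b)]  # omit NaNs per feature
        if a.size and b.size:
            pvals[j] = mannwhitneyu(a, b, alternative=alternative).pvalue
    return pvals

X = np.array([[10.0, 1.0], [11.0, 5.0], [12.0, 2.0], [13.0, 6.0],
              [1.0, 3.0], [2.0, 7.0], [3.0, 4.0], [4.0, 8.0]])
g_mask = np.array([True] * 4 + [False] * 4)
pvals = mwu_group_vs_rest(X, g_mask)
```

Features left without usable values on either side keep a NaN p-value in this sketch.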

netflow.methods.stats._pair_chunk(
start_end: Tuple[int, int],
) → Tuple[int, ndarray]

Compute p-values for a chunk of feature columns for MWU or Wilcoxon.

Parameters:

start_end (tuple(int, int)) – Half-open feature interval [start, end).

Returns:

  • start (int) – Start feature index.

  • pvals_chunk (numpy.ndarray) – P-values for features in [start, end).

Notes

This is used only for tests that SciPy does not vectorize across features: MWU and Wilcoxon.

netflow.methods.stats._pair_init_worker(
X1: ndarray,
X2: ndarray,
*,
test: Literal['MWU', 't-test'],
alternative: str,
nan_policy: str,
test_kwargs: Dict[str, Any],
) → None

Initializer for parallel pairwise feature-wise workers (MWU/Wilcoxon).

Parameters:
  • X1 (numpy.ndarray) – Matrices shaped (n_samples1, n_features) and (n_samples2, n_features).

  • X2 (numpy.ndarray) – Matrices shaped (n_samples1, n_features) and (n_samples2, n_features).

  • test ({"MWU","wilcoxon"}) – Feature-wise tests supported by this worker.

  • alternative ({"two-sided","less","greater"}) – Alternative hypothesis passed to SciPy.

  • nan_policy ({"omit","raise"}) – NaN handling.

  • test_kwargs (dict) – Extra keyword args forwarded to SciPy test functions.

Return type:

None

netflow.methods.stats.compute_pearson(
row_x,
row_y,
)

Compute Pearson correlation with each row in Y for a given row in X.

Parameters:
  • row_x (array_like) – 1-D arrays representing multiple observations of a single variable. The correlation is computed between row_x and row_y.

  • row_y (array_like) – 1-D arrays representing multiple observations of a single variable. The correlation is computed between row_x and row_y.

Returns:

  • correlation (float) – The correlation.

  • p_value (float) – The p-value.
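
For reference, a correlation and p-value for two 1-D arrays can be obtained with scipy.stats.pearsonr. Whether compute_pearson delegates to this exact call is an assumption; the sketch only illustrates the (correlation, p_value) pair described above.

```python
import numpy as np
from scipy.stats import pearsonr

# Two variables observed on the same five samples (illustrative data).
row_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
row_y = np.array([2.1, 3.9, 6.2, 8.1, 9.8])

correlation, p_value = pearsonr(row_x, row_y)
```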

netflow.methods.stats.compute_pearson_parallel(
X,
Y,
num_processors=None,
chunksize=None,
)

Compute Pearson correlation between each row of X and all rows of Y in parallel.

Parameters:
  • X (pandas.DataFrame) – Dataframes containing multiple variables and observations. Each row represents a variable and each column is an observation of each variable. X and Y must have the same number of columns (i.e., the same observations) but they need not have the same number of variables.

  • Y (pandas.DataFrame) – Dataframes containing multiple variables and observations. Each row represents a variable and each column is an observation of each variable. X and Y must have the same number of columns (i.e., the same observations) but they need not have the same number of variables.

  • num_processors (int) – Number of processors to use. Defaults to None (uses all available).

  • chunksize

Returns:

  • correlations (dict) – The resulting correlations in the form {index_row_X: {index_row_Y: corr}}

  • p_values (dict) – The resulting p_values in the form {index_row_X: {index_row_Y: p_value}}

netflow.methods.stats.compute_spearman(
row_x,
row_y,
)

Compute Spearman correlation with each row in Y for a given row in X.

Parameters:
  • row_x (array_like) – 1-D arrays representing multiple observations of a single variable. The correlation is computed between row_x and row_y.

  • row_y (array_like) – 1-D arrays representing multiple observations of a single variable. The correlation is computed between row_x and row_y.

Returns:

  • correlation (float) – The correlation.

  • p_value (float) – The p-value.
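
A small sketch using scipy.stats.spearmanr shows how rank correlation treats a monotone but non-linear relationship. That compute_spearman wraps this exact SciPy call is an assumption.

```python
import numpy as np
from scipy.stats import spearmanr

row_x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
row_y = row_x ** 3  # monotone but non-linear in row_x

# Ranks of row_x and row_y are identical, so the rank correlation is perfect.
correlation, p_value = spearmanr(row_x, row_y)
```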

netflow.methods.stats.compute_spearman_parallel(
X,
Y,
num_processors=None,
chunksize=None,
)

Compute Spearman correlation between each row of X and all rows of Y in parallel.

Parameters:
  • X (pandas.DataFrame) – Dataframes containing multiple variables and observations. Each row represents a variable and each column is an observation of each variable. X and Y must have the same number of columns (i.e., the same observations) but they need not have the same number of variables.

  • Y (pandas.DataFrame) – Dataframes containing multiple variables and observations. Each row represents a variable and each column is an observation of each variable. X and Y must have the same number of columns (i.e., the same observations) but they need not have the same number of variables.

  • num_processors (int) – Number of processors to use. Defaults to None (uses all available).

  • chunksize

Returns:

  • correlations (dict) – The resulting correlations in the form {index_row_X: {index_row_Y: corr}}

  • p_values (dict) – The resulting p_values in the form {index_row_X: {index_row_Y: p_value}}

netflow.methods.stats.mann_whitney_u_test(
values1,
values2,
alternative='two-sided',
**kwargs,
)

Perform the Mann-Whitney U rank test on two independent samples.

The Mann-Whitney U test is a nonparametric test of the null hypothesis that the distribution underlying values1 is the same as the distribution underlying values2. It is often used as a test of difference in location between distributions.

Computed via scipy.stats.mannwhitneyu.

Parameters:
  • values1 (array-like) – The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default), which can be specified in kwargs.

  • values2 (array-like) – The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default), which can be specified in kwargs.

  • alternative ({'two-sided', 'less', 'greater'}, optional) –

    Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):

    • ‘two-sided’: the distributions underlying the samples are not equal.

    • ‘less’: the distribution underlying the first sample is stochastically less than the distribution underlying the second sample.

    • ‘greater’: the distribution underlying the first sample is stochastically greater than the distribution underlying the second sample.

  • kwargs (dict) – Key-word arguments passed to scipy.stats.mannwhitneyu.

Returns:

p_value – The p-value.

Return type:

float

netflow.methods.stats.perform_stat_test(
values1,
values2,
test_type: str,
**kwargs,
) → float

Choose and perform a statistical test according to test_type.

Parameters:
  • values1 (array-like) – Vectors of measurements. For ‘wilcoxon’, these are paired vectors.

  • values2 (array-like) – Vectors of measurements. For ‘wilcoxon’, these are paired vectors.

  • test_type ({"t-test","MWU","wilcoxon"}) –

    Which test to apply:
    • “t-test”: two-sample independent t-test

    • “MWU”: Mann–Whitney U test

    • “wilcoxon”: Wilcoxon signed-rank test (paired)

  • **kwargs (dict) –

    Forwarded to the underlying SciPy test wrapper. In particular:
    • You may pass alternative in kwargs for all supported tests.

Returns:

p_value – P-value from the chosen test.

Return type:

float

Raises:

ValueError – If an invalid test_type is provided.
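
The dispatch on test_type can be sketched as below. The name run_test is hypothetical, and this sketch calls SciPy directly instead of the module's own wrappers (t_test, mann_whitney_u_test, wilcoxon_signed_rank_test), so defaults such as equal_var may differ from the library's.

```python
from scipy import stats

def run_test(values1, values2, test_type, **kwargs):
    """Hypothetical sketch: pick a SciPy test by name and return its p-value."""
    if test_type == "t-test":
        return float(stats.ttest_ind(values1, values2, **kwargs).pvalue)
    if test_type == "MWU":
        return float(stats.mannwhitneyu(values1, values2, **kwargs).pvalue)
    if test_type == "wilcoxon":
        # Paired test: values1 and values2 must have the same length.
        return float(stats.wilcoxon(values1, values2, **kwargs).pvalue)
    raise ValueError(f"Unrecognized test_type: {test_type!r}")

a = [5.1, 6.3, 7.2, 8.4, 9.0, 10.5]
b = [1.2, 2.1, 3.3, 4.0, 4.9, 6.0]
p_mwu = run_test(a, b, "MWU", alternative="greater")
p_t = run_test(a, b, "t-test", equal_var=False)
```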

netflow.methods.stats.perform_stat_test_matrix(
X1: DataFrame | ndarray,
X2: DataFrame | ndarray,
*,
test: Literal['MWU', 't-test'] = 'MWU',
alternative: Literal['two-sided', 'less', 'greater'] = 'two-sided',
nan_policy: Literal['omit', 'raise'] = 'omit',
equal_var: bool = False,
n_jobs: int = 1,
parallel_backend: Literal['processes', 'threads', 'none'] = 'processes',
chunk_size_features: int = 256,
test_kwargs: Dict[str, Any] | None = None,
) → ndarray

Compute per-feature p-values comparing two groups of samples.

This function assumes inputs are samples x features and returns a vector of p-values of length n_features.

Parameters:
  • X1 (pandas.DataFrame or numpy.ndarray) – Two matrices of shape (n_samples1, n_features) and (n_samples2, n_features). If DataFrames are given, they are converted once to NumPy arrays.

  • X2 (pandas.DataFrame or numpy.ndarray) – Two matrices of shape (n_samples1, n_features) and (n_samples2, n_features). If DataFrames are given, they are converted once to NumPy arrays.

  • test ({"MWU","t-test","wilcoxon"}, default="MWU") –

    Statistical test to perform:

    • “t-test”: independent two-sample t-test via scipy.stats.ttest_ind with axis=0

    • “MWU”: Mann–Whitney U via scipy.stats.mannwhitneyu (feature-wise loop/parallel)

    • “wilcoxon”: Wilcoxon signed-rank via scipy.stats.wilcoxon (paired; requires same n_samples)

  • alternative ({"two-sided","less","greater"}, default="two-sided") – Alternative hypothesis passed to SciPy.

  • nan_policy ({"omit","raise"}, default="omit") –

    NaN handling:

    • “omit”: omit NaNs feature-wise (t-test uses SciPy nan_policy; MWU/Wilcoxon omit manually)

    • “raise”: raise if NaNs are present (t-test uses SciPy; MWU/Wilcoxon will yield NaNs or raise upstream)

  • equal_var (bool, default=False) – Only used for “t-test”. False means Welch’s t-test.

  • n_jobs (int, default=1) – Number of workers for MWU/Wilcoxon feature loops. If 1, runs serially.

  • parallel_backend ({"processes","threads","none"}, default="processes") – Backend used for MWU/Wilcoxon parallel execution.

  • chunk_size_features (int, default=256) – Number of features per parallel chunk for MWU/Wilcoxon.

  • test_kwargs (dict, optional) –

    Extra keyword args forwarded to the underlying SciPy test:

    • MWU: forwarded to mannwhitneyu

    • Wilcoxon: forwarded to wilcoxon

    • t-test: forwarded to ttest_ind (in addition to alternative/equal_var/nan_policy)

Returns:

pvals – Raw p-values per feature.

Return type:

numpy.ndarray, shape (n_features,)

Raises:

ValueError – If feature dimensions mismatch, or if wilcoxon is requested with mismatched sample counts.

Notes

Why SciPy t-test is enough here:

  • ttest_ind supports vectorization across features with axis=0, so there is no Python loop per feature.

  • Using SciPy directly keeps this implementation simple and robust.

Why MWU/Wilcoxon still loop:

  • SciPy does not vectorize these tests across features, so looping (and optional parallelism) is necessary if you want one p-value per feature.
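
The two strategies in the Notes can be contrasted in a short sketch: a single vectorized ttest_ind call over all features versus a per-feature loop for MWU. The random data and variable names are illustrative only, and the loop here is serial.

```python
import numpy as np
from scipy.stats import mannwhitneyu, ttest_ind

rng = np.random.default_rng(0)
X1 = rng.normal(size=(12, 5))           # 12 samples x 5 features
X2 = rng.normal(loc=0.5, size=(15, 5))  # 15 samples x 5 features

# t-test: one call, vectorized across features with axis=0 (Welch's variant).
p_t = np.asarray(ttest_ind(X1, X2, axis=0, equal_var=False).pvalue)

# MWU: explicit loop, one SciPy call per feature column.
p_mwu = np.array([
    mannwhitneyu(X1[:, j], X2[:, j], alternative="two-sided").pvalue
    for j in range(X1.shape[1])
])

# The vectorized result agrees with a single-column call.
p_t_first = ttest_ind(X1[:, 0], X2[:, 0], equal_var=False).pvalue
```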

netflow.methods.stats.sig_feats_by_group(
groups: Series,
feats: DataFrame,
*,
test: Literal['MWU', 't-test'] = 'MWU',
alpha: float = 0.05,
method: str = 'fdr_bh',
min_group_size: int = 10,
samples_axis: Literal['auto', 'index', 'columns'] = 'auto',
alternative: Literal['two-sided', 'less', 'greater'] = 'two-sided',
nan_policy: Literal['omit', 'raise'] = 'omit',
equal_var: bool = False,
n_jobs: int = 1,
parallel_backend: Literal['processes', 'threads', 'none'] = 'processes',
chunk_size_features: int = 256,
test_kwargs: Dict[str, Any] | None = None,
top_n: int | None = None,
return_type: Literal['dict', 'wide'] = 'wide',
add_effect_sizes: bool = True,
log2fc_pseudocount: float = 1e-09,
) → Dict[Any, DataFrame] | DataFrame

Compute per-group feature significance vs the rest of the cohort.

For each group label g, compare samples in g vs all other samples, producing per-feature p-values and multiple-test-corrected p-values.

Parameters:
  • groups (pandas.Series) – Group labels indexed by sample ID.

  • feats (pandas.DataFrame) –

    Feature matrix in either orientation:

    • samples x features (samples on index)

    • features x samples (samples on columns)

    samples_axis controls inference/forcing of orientation.

  • test ({"MWU","t-test","wilcoxon"}, default="MWU") –

    Statistical test to perform:

    • “MWU”: Mann–Whitney U (unpaired, rank-based)

    • “t-test”: independent two-sample t-test (Welch by default)

    • “wilcoxon”: Wilcoxon signed-rank (paired)

    IMPORTANT: “wilcoxon” is not a generic group-vs-rest test (paired design). This function will raise if test=“wilcoxon”.

  • alpha (float, default=0.05) – Error rate for multiple testing correction.

  • method (str, default="fdr_bh") –

    Multiple test correction method.

    • bonferroni : one-step correction

    • sidak : one-step correction

    • holm-sidak : step down method using Sidak adjustments

    • holm : step-down method using Bonferroni adjustments

    • simes-hochberg : step-up method (independent)

    • hommel : closed method based on Simes tests (non-negative)

    • fdr_bh : Benjamini/Hochberg (non-negative)

    • fdr_by : Benjamini/Yekutieli (negative)

    • fdr_tsbh : two stage fdr correction (non-negative)

    • fdr_tsbky : two stage fdr correction (non-negative)

  • min_group_size (int, default=10) – Minimum number of samples required for a group to be tested. Must be >= 3.

  • samples_axis ({"auto","index","columns"}, default="auto") – Orientation control passed to _coerce_samples_x_features.

  • alternative ({"two-sided","less","greater"}, default="two-sided") – Alternative hypothesis.

  • nan_policy ({"omit","raise"}, default="omit") – NaN handling.

  • equal_var (bool, default=False) – Used for t-test only. False means Welch’s t-test.

  • n_jobs (int, default=1) – Workers for MWU feature-wise computation. If 1, runs serially.

  • parallel_backend ({"processes","threads","none"}, default="processes") – Backend for MWU parallelism.

  • chunk_size_features (int, default=256) – Feature chunk size for MWU parallelism.

  • test_kwargs (dict, optional) – Extra kwargs forwarded to SciPy test functions.

  • top_n (int, optional) – If provided, keep only the top_n features per group after sorting by corrected p-value then raw p-value.

  • return_type ({"wide", "dict"}, default="wide") – If “wide” (default), return a wide DataFrame with MultiIndex columns (group, metric). If “dict”, return a dict mapping group -> per-group record DataFrame.

  • add_effect_sizes (bool, default=True) –

    If True, include effect size summaries per feature:

    • n_in: number of samples within group (constant across features)

    • n_out: number of samples outside group (constant across features)

    • mean_in: mean feature value within group

    • mean_out: mean feature value outside group

    • mean_diff: mean_in - mean_out

    • log2fc: log2((mean_in + pseudocount) / (mean_out + pseudocount))

  • log2fc_pseudocount (float, default=1e-9) – Pseudocount added to means for log2 fold-change to avoid division by zero and log(0). Only used when add_effect_sizes=True.

Returns:

records_full

If return_type=“wide”:

DataFrame with index=features and columns MultiIndex (group, metric), where metric in {“p-value”,“corrected p-value”}.

If return_type=“dict”:

dict[group_label -> DataFrame(index=features, columns=[“p-value”,“corrected p-value”])]. Each per-group DataFrame is sorted by corrected p-value then p-value.

Return type:

dict or pandas.DataFrame

Raises:

ValueError – If min_group_size < 3, method invalid, alignment fails, or test=“wilcoxon”.

Notes

  • t-test uses SciPy’s vectorized implementation across features via axis=0.

  • MWU is computed feature-wise (SciPy is not vectorized). This implementation avoids large submatrix allocations by gathering per-feature vectors via indices.
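
The effect size summaries listed under add_effect_sizes follow directly from the stated formulas. The sketch below uses a hypothetical helper name, effect_sizes, and plain np.mean; how the library treats NaNs in these means is not specified here.

```python
import numpy as np

def effect_sizes(X, g_mask, pseudocount=1e-9):
    """Hypothetical sketch of the per-feature effect size columns."""
    mean_in = X[g_mask].mean(axis=0)    # mean within the group
    mean_out = X[~g_mask].mean(axis=0)  # mean outside the group
    return {
        "n_in": int(g_mask.sum()),
        "n_out": int((~g_mask).sum()),
        "mean_in": mean_in,
        "mean_out": mean_out,
        "mean_diff": mean_in - mean_out,
        "log2fc": np.log2((mean_in + pseudocount) / (mean_out + pseudocount)),
    }

X = np.array([[4.0, 8.0], [4.0, 8.0], [1.0, 2.0], [1.0, 2.0]])
g_mask = np.array([True, True, False, False])
sizes = effect_sizes(X, g_mask)
```

With a group mean four times the rest mean, log2fc comes out at approximately 2 for both features.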

netflow.methods.stats.stat_test(
df1: DataFrame,
df2: DataFrame,
test: Literal['MWU', 't-test'] = 'MWU',
alpha: float = 0.05,
method: str = 'fdr_bh',
samples_axis: Literal['auto', 'index', 'columns'] = 'auto',
alternative: Literal['two-sided', 'less', 'greater'] = 'two-sided',
nan_policy: Literal['omit', 'raise'] = 'omit',
equal_var: bool = False,
n_jobs: int = 1,
parallel_backend: Literal['processes', 'threads', 'none'] = 'processes',
chunk_size_features: int = 256,
test_kwargs: Dict[str, Any] | None = None,
) → DataFrame

Perform statistical test between groups/datasets and apply multiple test correction.

This compares df1 vs df2 feature-by-feature, returning raw and corrected p-values.

The statistical tests are computed via scipy.stats.

Parameters:
  • df1 (pandas.DataFrame) –

    The measurements to compare. The DataFrames must have the same features. If test='wilcoxon', they must also have the same number of observations.

    Either orientation is accepted:

    • samples x features (samples on index, features on columns)

    • features x samples (features on index, samples on columns)

    The samples_axis argument controls whether orientation is inferred or enforced.

    If test=“wilcoxon” (paired), the sample dimension must match and be aligned after coercion.

  • df2 (pandas.DataFrame) –

    The measurements to compare. The DataFrames must have the same features. If test='wilcoxon', they must also have the same number of observations.

    Either orientation is accepted:

    • samples x features (samples on index, features on columns)

    • features x samples (features on index, samples on columns)

    The samples_axis argument controls whether orientation is inferred or enforced.

    If test=“wilcoxon” (paired), the sample dimension must match and be aligned after coercion.

  • test ({"MWU","t-test","wilcoxon"}) –

    The statistical test that should be performed. Options are:

    • ‘MWU’ : Mann-Whitney U test (default).

    • ‘t-test’ : T-test

    • ‘wilcoxon’ : Wilcoxon signed-rank test

  • alpha (float) – The family-wise error rate (FWER) passed to statsmodels multipletests, should be between 0 and 1.

  • method (str) –

    Method for multiple test correction, default=‘fdr_bh’.

    Options:

    • bonferroni : one-step correction

    • sidak : one-step correction

    • holm-sidak : step down method using Sidak adjustments

    • holm : step-down method using Bonferroni adjustments

    • simes-hochberg : step-up method (independent)

    • hommel : closed method based on Simes tests (non-negative)

    • fdr_bh : Benjamini/Hochberg (non-negative)

    • fdr_by : Benjamini/Yekutieli (negative)

    • fdr_tsbh : two stage fdr correction (non-negative)

    • fdr_tsbky : two stage fdr correction (non-negative)

  • samples_axis ({"auto","index","columns"}, default="auto") –

    Orientation control:

    • “index”: enforce samples on df.index (df is samples x features)

    • “columns”: enforce samples on df.columns (df is features x samples), then transpose

    • “auto”: infer orientation strictly from label agreement between df1 and df2 (see _coerce_two_dfs_samples_x_features for details)

  • alternative ({"two-sided","less","greater"}, default="two-sided") – Alternative hypothesis passed to the underlying SciPy test.

  • nan_policy ({"omit","raise"}, default="omit") – NaN handling. For t-test, this is passed to SciPy. For MWU/Wilcoxon, NaNs are handled feature-wise by the underlying implementation.

  • equal_var (bool, default=False) – Used for t-test only. False means Welch’s t-test.

  • n_jobs (int, default=1) – Workers for MWU/Wilcoxon (feature-wise). If 1, runs serially.

  • parallel_backend ({"processes","threads","none"}, default="processes") – Backend for MWU/Wilcoxon.

  • chunk_size_features (int, default=256) – Feature chunk size for MWU/Wilcoxon.

  • test_kwargs (dict) – Key-word arguments passed to scipy.stats for performing the statistical test.

Returns:

  • record (pandas.DataFrame) – DataFrame indexed by feature name with columns:

    • “p-value”

    • “corrected p-value”

Notes

  • For unpaired tests (“MWU” and “t-test”), this wrapper routes to the optimized group-vs-rest engine _sig_feats_by_group_core by constructing a two-group grouping vector over the concatenated observations.

  • MWU can be parallelized across feature chunks using additional kwargs:
    • n_jobs (int): number of workers (default 1)

    • parallel_backend ({“processes”,“threads”,“none”}): backend (default “processes”)

    • chunk_size_features (int): features per chunk (default 256)

  • For “wilcoxon” (paired), a paired signed-rank test is computed between df1 and df2 columns, feature-wise.

  • These keys (if present) are consumed by the wrapper and not forwarded to SciPy:
    • alternative : {“two-sided”,“less”,“greater”} (default “two-sided”)

    • nan_policy : {“omit”,“raise”} (default “omit”)
      • For t-test, this is passed to SciPy.

      • For MWU/wilcoxon, NaNs are omitted feature-wise when nan_policy=“omit”.

    • equal_var : bool (default False) for t-test (Welch vs pooled)

    • n_jobs : int (default 1) for MWU parallelism

    • parallel_backend : {“processes”,“threads”,“none”} (default “processes”) for MWU

    • chunk_size_features : int (default 256) for MWU
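
The two-group grouping vector mentioned in the first note can be pictured as stacking the observations of both datasets and marking which rows came from df1. This is a NumPy sketch with illustrative names; the internals of _sig_feats_by_group_core are not shown here and may differ.

```python
import numpy as np

X1 = np.array([[1.0, 2.0], [3.0, 4.0]])               # 2 samples x 2 features
X2 = np.array([[5.0, 6.0], [7.0, 8.0], [9.0, 10.0]])  # 3 samples x 2 features

# Concatenate the observations and mark which rows belong to the first dataset.
X = np.vstack([X1, X2])
g_mask = np.concatenate([
    np.ones(len(X1), dtype=bool),
    np.zeros(len(X2), dtype=bool),
])
```

Comparing X[g_mask] against X[~g_mask] then reproduces the df1 vs df2 comparison as a group-vs-rest problem.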

netflow.methods.stats.t_test(
values1,
values2,
alternative='two-sided',
equal_var=False,
**kwargs,
)

Calculate the T-test for the means of two independent samples of scores.

This is a test for the null hypothesis that 2 independent samples have identical average (expected) values. By default (equal_var=False) this performs Welch’s t-test, which does not assume that the populations have identical variances.

Computed via scipy.stats.ttest_ind.

Parameters:
  • values1 (array-like) – The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default), which can be specified in kwargs.

  • values2 (array-like) – The arrays must have the same shape, except in the dimension corresponding to axis (the first, by default), which can be specified in kwargs.

  • alternative ({'two-sided', 'less', 'greater'}, optional) –

    Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):

    • ‘two-sided’: the means of the distributions underlying the samples are unequal.

    • ‘less’: the mean of the distribution underlying the first sample is less than the mean of the distribution underlying the second sample.

    • ‘greater’: the mean of the distribution underlying the first sample is greater than the mean of the distribution underlying the second sample.

  • equal_var (bool, default=False) – Passed to scipy.stats.ttest_ind; default False corresponds to Welch’s t-test. If True, performs the standard independent 2 sample test that assumes equal population variances.

  • kwargs (dict) – Key-word arguments passed to scipy.stats.ttest_ind.

Returns:

p_value – The p-value.

Return type:

float
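
The effect of equal_var can be seen by calling scipy.stats.ttest_ind both ways on samples with clearly different variances and sizes. The data are illustrative and the sketch calls SciPy directly, not the wrapper documented above.

```python
from scipy.stats import ttest_ind

values1 = [4.8, 5.1, 5.0, 4.9, 5.2]                 # tight spread
values2 = [3.0, 7.5, 1.2, 9.1, 4.4, 6.8, 2.3, 8.0]  # wide spread

# Welch's t-test: no equal-variance assumption.
p_welch = ttest_ind(values1, values2, equal_var=False).pvalue

# Standard pooled-variance t-test: assumes equal population variances.
p_pooled = ttest_ind(values1, values2, equal_var=True).pvalue
```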

netflow.methods.stats.wilcoxon_signed_rank_test(
values1,
values2=None,
alternative='two-sided',
**kwargs,
)

The Wilcoxon signed-rank test.

The Wilcoxon signed-rank test tests the null hypothesis that two related paired samples come from the same distribution. In particular, it tests whether the distribution of the differences values1 - values2 is symmetric about zero. It is a non-parametric version of the paired T-test.

Computed via scipy.stats.wilcoxon.

Parameters:
  • values1 (array-like) – Either the first set of measurements (in which case values2 is the second set of measurements), or the differences between two sets of measurements (in which case values2 is not to be specified). Must be one-dimensional.

  • values2 (array-like) – Optional. Either the second set of measurements (if values1 is the first set of measurements), or not specified (if values1 is the differences between two sets of measurements). Must be one-dimensional.

  • alternative ({'two-sided', 'less', 'greater'}, optional) –

    Defines the alternative hypothesis. The following options are available (default is ‘two-sided’):

    • ‘two-sided’: the distribution underlying the differences d = values1 - values2 (or values1 alone if values2 is not given) is not symmetric about zero.

    • ‘less’: the distribution underlying d is stochastically less than a distribution symmetric about zero.

    • ‘greater’: the distribution underlying d is stochastically greater than a distribution symmetric about zero.

  • kwargs (dict) – Key-word arguments passed to scipy.stats.wilcoxon.

Returns:

p_value – The p-value.

Return type:

float
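
The two calling conventions described for values1 and values2 give the same result, as this sketch with scipy.stats.wilcoxon shows. The data are illustrative and the sketch calls SciPy directly, not the wrapper documented above.

```python
import numpy as np
from scipy.stats import wilcoxon

values1 = np.array([1.1, 2.3, 3.2, 4.8, 5.9, 7.4])
values2 = np.array([1.0, 2.0, 3.6, 4.0, 5.0, 6.0])

# Paired form: both sets of measurements are passed.
p_paired = wilcoxon(values1, values2).pvalue

# Difference form: only the differences are passed.
p_diffs = wilcoxon(values1 - values2).pvalue
```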