grassp.pp.calculate_enrichment_vs_all

calculate_enrichment_vs_all(adata, covariates=None, subcellular_enrichment_column='subcellular_enrichment', enrichment_method='lfc', correlation_threshold=1.0, original_intensities_key='original_intensities', keep_raw=True, min_comparison_warning=None)

Calculates enrichment of each sample against all other samples as the background.

This function determines enrichment by comparing each sample’s protein intensities to a background composed of all other samples that are not highly correlated with it.
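The idea of a correlation-filtered background can be illustrated with a minimal, standard-library sketch. This is not the grassp implementation: the sample names and intensities are hypothetical, the helpers `pearson`, `background_for`, and `log2_fold_change` are invented for illustration, and it assumes that "highly correlated" means a Pearson correlation above the threshold and that intensities are positive, untransformed values.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def background_for(samples, target, correlation_threshold):
    """Names of all other samples whose correlation with `target`
    does not exceed the threshold (assumed reading of the docs)."""
    return [
        name
        for name, values in samples.items()
        if name != target
        and pearson(samples[target], values) <= correlation_threshold
    ]

def log2_fold_change(samples, target, background):
    """Per-protein log2 ratio of the target to the mean of the background."""
    lfc = []
    for i in range(len(samples[target])):
        bg_mean = sum(samples[name][i] for name in background) / len(background)
        lfc.append(math.log2(samples[target][i] / bg_mean))
    return lfc

# Hypothetical intensities: one list of per-protein values per sample.
samples = {
    "nucleus_1": [10.0, 2.0, 1.0, 8.0],
    "nucleus_2": [11.0, 2.5, 1.2, 8.5],  # near-duplicate of nucleus_1
    "cytosol_1": [1.0, 9.0, 7.0, 2.0],
    "membrane_1": [2.0, 3.0, 10.0, 1.0],
}

# nucleus_2 correlates strongly with nucleus_1, so it is left out.
background = background_for(samples, "nucleus_1", correlation_threshold=0.9)
lfc = log2_fold_change(samples, "nucleus_1", background)
```

In this sketch, lowering the threshold removes the near-duplicate sample from the background, so it no longer dampens the fold change of proteins that both samples share.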

Parameters:
adata AnnData

An AnnData object with protein intensities in .X.

covariates Optional[Sequence[str]] (default: None)

A list of column names in adata.var for grouping. If None, columns starting with covariate_ are used.

subcellular_enrichment_column str (default: 'subcellular_enrichment')

The column in .var with subcellular enrichment labels.

enrichment_method Literal['lfc', 'proportion'] (default: 'lfc')

The method for calculating enrichment. Either "lfc" (log-fold change) or "proportion" (proportion of total intensity).

correlation_threshold float (default: 1.0)

Samples whose correlation with the target sample exceeds this value are excluded from the background, so that a sample is not compared against itself or against highly similar samples.

original_intensities_key str | None (default: 'original_intensities')

If provided, the original intensities are stored in .layers under this key.

keep_raw bool (default: True)

If True, the original unaggregated data is stored in .raw.

min_comparison_warning int | None (default: None)

If the number of control samples for a given comparison is below this threshold, a warning is issued.
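How enrichment_method and min_comparison_warning might interact can be sketched for a single protein as follows. This is an assumption-laden illustration, not the library's code: `enrichment_score` is a hypothetical helper, the background is summarised by its mean, and "proportion" is read here as the sample's share of the combined sample and background intensity, which may differ from what grassp actually computes.

```python
import math
import warnings

def enrichment_score(sample_value, background_values,
                     method="lfc", min_comparison_warning=None):
    """Hypothetical single-protein enrichment score against a background."""
    # Warn when too few background samples are available (assumed behaviour).
    if (min_comparison_warning is not None
            and len(background_values) < min_comparison_warning):
        warnings.warn(
            f"Only {len(background_values)} background sample(s); "
            f"expected at least {min_comparison_warning}."
        )
    background_mean = sum(background_values) / len(background_values)
    if method == "lfc":
        # Log2 fold change of the sample over the background mean.
        return math.log2(sample_value / background_mean)
    if method == "proportion":
        # Assumed reading: share of the combined intensity held by the sample.
        return sample_value / (sample_value + background_mean)
    raise ValueError(f"Unknown enrichment method: {method!r}")

lfc_score = enrichment_score(8.0, [2.0, 2.0], method="lfc")
proportion_score = enrichment_score(8.0, [2.0, 2.0], method="proportion")
```

Under these assumptions the log-fold change is unbounded and symmetric around zero, while the proportion stays between 0 and 1, which is worth keeping in mind when comparing results from the two methods.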

Return type:

AnnData

Returns:

An AnnData object with enrichment scores and p-values:

  • .X contains enrichment scores (log2 fold changes or proportions).

  • .layers["pvals"] stores p-values from the t-tests.

  • .var["enriched_vs"] lists the conditions used as the background.