grassp.pp.calculate_enrichment_vs_untagged
- calculate_enrichment_vs_untagged(data, covariates=None, subcellular_enrichment_column='subcellular_enrichment', untagged_name='UNTAGGED', original_intensities_key=None, drop_untagged=True, keep_raw=True)
Calculates enrichment scores and p-values by comparing tagged samples against untagged controls.
This function performs a t-test to determine the significance of protein enrichment in tagged samples relative to untagged controls. The enrichment is calculated as the log2 fold change of median intensities.
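The two quantities described above can be illustrated for a single protein. This is a minimal sketch, not the library's implementation: it assumes linear (non-log) intensities and Welch's unequal-variance form of the t-test, neither of which is stated in this reference. The helper names and the intensity values are hypothetical.

```python
import math
import statistics


def log2_fold_change(tagged, untagged):
    """log2 of the ratio of median intensities (tagged over untagged)."""
    return math.log2(statistics.median(tagged) / statistics.median(untagged))


def welch_t(tagged, untagged):
    """Welch's t statistic and its approximate degrees of freedom.

    Assumption: the unequal-variance form is used here for illustration;
    the reference above only says "a t-test".
    """
    n1, n2 = len(tagged), len(untagged)
    # Squared standard error contributed by each group.
    se1 = statistics.variance(tagged) / n1
    se2 = statistics.variance(untagged) / n2
    t = (statistics.mean(tagged) - statistics.mean(untagged)) / math.sqrt(se1 + se2)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    dof = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, dof


# Hypothetical intensities for one protein across replicate samples.
tagged = [800.0, 1000.0, 1200.0]
untagged = [200.0, 250.0, 300.0]

lfc = log2_fold_change(tagged, untagged)  # medians 1000 and 250, so log2(4) = 2.0
t_stat, dof = welch_t(tagged, untagged)
```

A two-sided p-value would then be read from the t distribution with `dof` degrees of freedom, for example with a statistics library such as SciPy.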
- Parameters:
  - data (AnnData): An AnnData object with protein intensities in .X.
  - covariates (Optional[Sequence[str]], default: None): A list of column names in data.var used to group samples. If None, columns starting with covariate_ are used.
  - subcellular_enrichment_column (str, default: 'subcellular_enrichment'): The column in .var that contains subcellular enrichment labels.
  - untagged_name (str, default: 'UNTAGGED'): The label in subcellular_enrichment_column that identifies untagged control samples.
  - original_intensities_key (Optional[str], default: None): If specified, the original intensity values are stored in data.layers[original_intensities_key].
  - drop_untagged (bool, default: True): If True, untagged samples are removed from the returned AnnData object.
  - keep_raw (bool, default: True): If True, the original unaggregated data is stored in .raw.
- Return type: AnnData
- Returns: Aggregated AnnData object with enrichment scores and p-values, containing:
  - .X: log2 fold changes relative to untagged controls.
  - .layers["pvals"]: p-values from the t-tests.
  - .layers[original_intensities_key]: raw intensity values, if original_intensities_key is set.
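The covariates and untagged_name parameters describe two selection steps: picking grouping columns by the covariate_ prefix when covariates is None, and separating control samples from tagged ones by label. The sketch below illustrates that logic only, under the assumption that it is a plain prefix match and a plain equality test. It uses a dictionary standing in for the columns of data.var; the helper names, the column values, and the "LAMP1" tag label are hypothetical.

```python
# Hypothetical per-sample annotations standing in for the columns of data.var.
sample_annotations = {
    "subcellular_enrichment": ["UNTAGGED", "LAMP1", "LAMP1", "UNTAGGED"],
    "covariate_batch": ["b1", "b1", "b2", "b2"],
    "covariate_cell_line": ["HEK", "HEK", "HEK", "HEK"],
    "replicate": [1, 1, 2, 2],
}


def default_covariates(columns, prefix="covariate_"):
    """Column names selected when covariates is None (assumed prefix match)."""
    return [name for name in columns if name.startswith(prefix)]


def split_by_tag(labels, untagged_name="UNTAGGED"):
    """Return (untagged, tagged) sample indices based on the enrichment label."""
    untagged = [i for i, label in enumerate(labels) if label == untagged_name]
    tagged = [i for i, label in enumerate(labels) if label != untagged_name]
    return untagged, tagged


covariates = default_covariates(sample_annotations)
untagged_idx, tagged_idx = split_by_tag(sample_annotations["subcellular_enrichment"])
```

In this toy layout, samples sharing the same covariate values would be compared against the untagged samples from the same group, which is presumably why the covariates are used for grouping.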