grassp.pp.aggregate_proteins

aggregate_proteins(data, grouping_columns, agg_func=np.median)

Aggregates the intensities of proteins that share a group, separately for each sample, using a given function.

Parameters:
data AnnData

The annotated data matrix with proteins as observations (rows).

grouping_columns Union[str, List[str]]

Column name(s) in data.obs that define the groups of proteins to aggregate.

agg_func Callable[[ndarray, Optional[int]], ndarray] (default: np.median)

Function to use for aggregation. Defaults to np.median.
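
As a minimal sketch of what a custom agg_func could look like: the type hint indicates a callable that takes an array and an optional axis and returns an array, the same shape of interface as np.median. The name nan_median is hypothetical, and the assumption that the function is called with axis=0 on a block of proteins (rows) by samples (columns) is not confirmed by this page.

```python
import numpy as np

def nan_median(values, axis=None):
    """Hypothetical aggregator: a median that ignores missing (NaN) intensities."""
    return np.nanmedian(values, axis=axis)

# Three proteins (rows) measured in two samples (columns), one value missing.
block = np.array([[1.0, 4.0],
                  [3.0, np.nan],
                  [5.0, 8.0]])

# Assumption: aggregation runs over the protein axis, giving one value per sample.
collapsed = nan_median(block, axis=0)
```

A function like this could then be passed as agg_func=nan_median in place of the default.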

Return type:

AnnData

Returns:

A new AnnData object with aggregated intensity values. The number of variables (samples) remains the same, but the number of observations (proteins) will correspond to the number of unique groups defined by grouping_columns.

Notes

This function is useful, for example, for combining multiple proteins that belong to the same gene. For each sample, it groups the proteins by the provided grouping_columns and then aggregates their intensity values using the specified agg_func.
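
The grouping described above can be illustrated with plain NumPy. This is a sketch of the idea only, not the library's implementation: aggregate_rows, the matrix X, and the gene labels are hypothetical, and a single grouping column is assumed.

```python
import numpy as np

def aggregate_rows(X, groups, agg_func=np.median):
    """Collapse rows of X that share a group label; columns are left untouched."""
    groups = np.asarray(groups)
    labels = np.unique(groups)  # sorted unique group labels
    # One aggregated row per group, computed column by column (axis=0).
    out = np.vstack([agg_func(X[groups == g], axis=0) for g in labels])
    return labels, out

# Four proteins (rows) x three samples (columns), with hypothetical gene labels.
X = np.array([[1.0, 2.0, 3.0],
              [3.0, 4.0, 5.0],
              [10.0, 20.0, 30.0],
              [5.0, 6.0, 7.0]])
genes = ["GENE_A", "GENE_A", "GENE_B", "GENE_A"]

labels, aggregated = aggregate_rows(X, genes)
# aggregated has one row per unique gene and the original three sample columns.
```

With the function documented here, the corresponding call would presumably be aggregate_proteins(data, grouping_columns="gene_name"), where gene_name stands for a hypothetical column in data.obs.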