grassp.pp.filter_proteins_per_replicate
- filter_proteins_per_replicate(data, grouping_columns, min_replicates=1, min_samples=1, inplace=True)
Filter proteins based on detection in replicates.
- Parameters:
- data
AnnData
The annotated data matrix with proteins as observations (rows).
- grouping_columns
Union[str, List[str]]
Column name(s) in data.var used to group samples into replicates.
- min_replicates
int (default: 1)
Minimum number of replicates a protein must be detected in to pass filtering.
- min_samples
int (default: 1)
Minimum number of sample groups a protein must be detected in to pass filtering.
- inplace
bool (default: True)
If True, filter data in place; if False, return a boolean mask instead (see Returns).
- Return type:
ndarray | None
- Returns:
numpy.ndarray or None
If inplace=False, returns a boolean mask indicating which proteins passed filtering. If inplace=True, returns None and modifies the input data.
Notes
This function filters proteins based on their detection pattern across replicates. Within each group of samples (defined by grouping_columns), a protein passes if it is detected in at least min_replicates samples of that group. The protein is kept only if it passes in at least min_samples groups.
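The rule in the Notes can be sketched with plain NumPy and pandas. This is an illustration only, not the library's implementation: the helper name replicate_filter_mask and the sample_meta frame (standing in for data.var) are hypothetical, and the detection criterion used here (value is finite and greater than zero) is an assumption, since the page does not define "detected".

```python
import numpy as np
import pandas as pd


def replicate_filter_mask(X, sample_meta, grouping_columns,
                          min_replicates=1, min_samples=1):
    """Return a boolean mask over proteins (rows of X).

    Hypothetical helper. Assumption: a protein is "detected" in a sample
    when its value is finite and greater than zero.
    """
    X = np.asarray(X, dtype=float)
    detected = np.isfinite(X) & (X > 0)  # shape: (n_proteins, n_samples)

    if isinstance(grouping_columns, str):
        grouping_columns = [grouping_columns]

    groups_passed = np.zeros(X.shape[0], dtype=int)
    # GroupBy.indices maps each group key to the positional indices of its samples
    for sample_idx in sample_meta.groupby(grouping_columns).indices.values():
        n_detected = detected[:, sample_idx].sum(axis=1)
        # A protein passes this group if enough replicates detect it
        groups_passed += (n_detected >= min_replicates).astype(int)

    # Keep proteins that pass in at least min_samples groups
    return groups_passed >= min_samples


# Three proteins (rows) measured in four samples (columns)
X = np.array([
    [1.0, 2.0, 0.0, 0.0],     # detected in both "ctrl" replicates only
    [1.0, 0.0, 3.0, np.nan],  # detected once in each group
    [4.0, 5.0, 6.0, 7.0],     # detected in every sample
])
sample_meta = pd.DataFrame({"condition": ["ctrl", "ctrl", "treat", "treat"]})

mask_one_group = replicate_filter_mask(
    X, sample_meta, "condition", min_replicates=2, min_samples=1
)
mask_two_groups = replicate_filter_mask(
    X, sample_meta, "condition", min_replicates=2, min_samples=2
)
print(mask_one_group.tolist())   # → [True, False, True]
print(mask_two_groups.tolist())  # → [False, False, True]
```

Raising min_samples from 1 to 2 drops the first protein, because it reaches the two-replicate threshold in only one of the two groups.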