grassp.pp.calculate_qc_metrics
- calculate_qc_metrics(data, qc_vars=(), percent_top=(50, 100, 200, 500), layer=None, use_raw=False, inplace=True, log1p=True, var_type='proteins', expr_type='intensity', parallel=None)
Calculate quality control metrics.
- Parameters:
- data
AnnData
The annotated data matrix with proteins as observations (rows).
- qc_vars
Union[Collection[str], str] (default: ())
Keys for boolean columns in .var that indicate a protein is a quality control protein.
- percent_top
Optional[Collection[int]] (default: (50, 100, 200, 500))
Numbers of top proteins (ranked by intensity) for which to compute the percentage of total intensity as QC metrics. Set to None to disable.
- layer
Optional[str] (default: None)
If provided, use data.layers[layer] for expression values.
- use_raw
bool (default: False)
If True, use data.raw for expression values.
- inplace
bool (default: True)
Whether to add the metrics to the input object (True) or return them (False).
- log1p
bool (default: True)
If True, compute log1p of expression values.
- var_type
str (default: 'proteins')
Name for variables (e.g. ‘proteins’, ‘genes’, etc).
- expr_type
str (default: 'intensity')
Name for expression values (e.g. ‘intensity’, ‘counts’, etc).
- parallel
Optional[bool] (default: None)
Whether to parallelize computation.
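The percent_top and log1p parameters above can be illustrated with a minimal pure-Python sketch. It is not the library's implementation: the helper name pct_in_top and the toy values are hypothetical, the sketch silently clips a rank larger than the number of proteins (the library may behave differently), and it applies log1p to a total only to show what the transform does.

```python
import math

def pct_in_top(intensities, n_top):
    """Percentage of total intensity carried by the n_top largest values."""
    total = sum(intensities)
    if total == 0:
        return 0.0
    # Assumption of this sketch: an n_top beyond the list length is clipped.
    top = sorted(intensities, reverse=True)[:n_top]
    return 100.0 * sum(top) / total

# Toy intensities of four proteins in one sample (made-up values).
sample = [60.0, 30.0, 10.0, 0.0]

# One percentage per requested rank, analogous to percent_top=(1, 2).
top_pcts = {n: pct_in_top(sample, n) for n in (1, 2)}

# log1p(x) is ln(1 + x); it stays finite when x is 0.
log_total = math.log1p(sum(sample))
```

A sample in which a handful of proteins carries most of the intensity yields values close to 100 even at small ranks, which is why several ranks are reported side by side.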
- Return type:
tuple[DataFrame, DataFrame] | None
- Returns:
If inplace is False, returns a tuple containing:
- A DataFrame with protein-based metrics (var)
- A DataFrame with sample-based metrics (obs)
If inplace is True, returns None and adds the metrics to the input object.
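The two return modes can be handled with a small wrapper such as the sketch below. The name collect_qc_tables is hypothetical and not part of grassp; the QC function is passed in as an argument so the sketch carries no hard dependency, and the tuple order (var first, then obs) follows the description above.

```python
def collect_qc_tables(calculate_qc_metrics, data, **kwargs):
    """Call a QC function with inplace=False and unpack its two tables.

    `calculate_qc_metrics` is any callable with the documented signature;
    in practice it would be grassp.pp.calculate_qc_metrics.
    """
    result = calculate_qc_metrics(data, inplace=False, **kwargs)
    if result is None:
        # None is only documented for inplace=True, so treat it as an error here.
        raise ValueError("expected a (var, obs) tuple when inplace=False")
    var_metrics, obs_metrics = result
    return var_metrics, obs_metrics
```

With grassp installed, this would be called as collect_qc_tables(grassp.pp.calculate_qc_metrics, adata), where adata is a placeholder for an existing AnnData object. The input object is left without the added metrics in this mode.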
Notes
Calculates quality control metrics for both proteins and samples, including:
- Number of samples expressing each protein
- Total intensity per sample
- Number of proteins detected per sample
- Percentage of intensity from top proteins
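The counting and summing metrics in this list can be sketched on a toy matrix in plain Python. This is an illustration under stated assumptions, not the library's code: rows are proteins and columns are samples, the values are made up, and "expressed" or "detected" is taken to mean an intensity greater than zero.

```python
# Toy intensity matrix: rows are proteins, columns are samples (assumed layout).
intensity = [
    [5.0, 0.0, 2.0],  # protein A
    [0.0, 0.0, 3.0],  # protein B
    [1.0, 4.0, 0.0],  # protein C
]

# Number of samples expressing each protein: nonzero entries per row.
n_samples_per_protein = [sum(v > 0 for v in row) for row in intensity]

# Transpose so that each tuple holds one sample's intensities.
columns = list(zip(*intensity))

# Total intensity per sample: column sums.
total_per_sample = [sum(col) for col in columns]

# Number of proteins detected per sample: nonzero entries per column.
n_proteins_per_sample = [sum(v > 0 for v in col) for col in columns]
```

Data sets that encode missing measurements as NaN rather than zero would need a different detection test than v > 0.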