grassp.pp.highly_variable_proteins
- highly_variable_proteins(data, inplace=True, n_top_proteins=None, flavor='seurat', subset=False, batch_key=None, **kwargs)
Identify highly variable proteins.
- Parameters:
- data
AnnData
The annotated data matrix with proteins as observations (rows).
- inplace
bool (default: True)
Whether to store results in data.obs or return them.
- n_top_proteins
Optional[int] (default: None)
Number of highly variable proteins to keep. If None, flavor-specific defaults are used.
- flavor
Literal['seurat', 'cell_ranger', 'seurat_v3', 'seurat_v3_paper'] (default: 'seurat')
Method for identifying highly variable proteins. Options are:
'seurat' - Seurat's method (default)
'cell_ranger' - Cell Ranger's method
'seurat_v3' - Seurat v3 method
'seurat_v3_paper' - Method from the Seurat v3 paper
- subset
bool (default: False)
Whether to subset the data to highly variable proteins.
- batch_key
Optional[str] (default: None)
If specified, highly variable proteins are selected within each batch separately.
- **kwargs
Additional arguments to pass to scanpy.pp.highly_variable_genes.
- Return type:
DataFrame | None
- Returns:
pandas.DataFrame or None
If inplace=False, returns a DataFrame of highly variable proteins. If inplace=True, returns None and stores the results in data.obs.
Notes
This function identifies highly variable proteins using methods adapted from single-cell RNA sequencing analysis. With inplace=True, the results are stored in data.obs with the following fields:
- highly_variable: boolean indicator
- means: mean expression
- dispersions: dispersion of expression
- dispersions_norm: normalized dispersion
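To make the four fields above concrete, the following is a minimal standard-library sketch of how such per-protein statistics could be derived. It is not the grassp or scanpy implementation: the helper name variable_protein_stats is hypothetical, dispersion is assumed to be the variance-to-mean ratio, and the normalization is a single z-score over all proteins rather than the mean-binned normalization that dispersion-based flavors typically use.

```python
from statistics import mean, pstdev, variance


def variable_protein_stats(rows, n_top):
    """Hypothetical helper: per-protein statistics for `rows`.

    `rows` is a list of equal-length value lists, one list per protein.
    """
    means = [mean(r) for r in rows]
    # Assumption: dispersion is the variance-to-mean ratio (0.0 when the mean is 0).
    dispersions = [variance(r) / m if m != 0 else 0.0 for r, m in zip(rows, means)]
    # Simplification: z-score across all proteins at once, not within mean bins.
    center = mean(dispersions)
    spread = pstdev(dispersions) or 1.0
    dispersions_norm = [(d - center) / spread for d in dispersions]
    # Flag the n_top proteins with the largest normalized dispersion.
    ranked = sorted(range(len(rows)), key=lambda i: dispersions_norm[i], reverse=True)
    top = set(ranked[:n_top])
    return [
        {
            "highly_variable": i in top,
            "means": means[i],
            "dispersions": dispersions[i],
            "dispersions_norm": dispersions_norm[i],
        }
        for i in range(len(rows))
    ]


# Toy input: three proteins measured in four samples.
rows = [
    [10.0, 10.0, 10.0, 10.0],  # flat profile
    [1.0, 5.0, 9.0, 25.0],     # strongly varying profile
    [4.0, 5.0, 6.0, 5.0],      # mildly varying profile
]
stats = variable_protein_stats(rows, n_top=1)
```

Each returned dictionary mirrors one row of the fields listed above, so filtering on highly_variable in this toy output is analogous to filtering data.obs on the same column after a call with inplace=True.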