grassp.tl.calculate_cluster_enrichment
- calculate_cluster_enrichment(data, cluster_key='leiden', gene_name_key='Gene_name_canonical', gene_sets='custom_goterms_genes_reviewed.gmt', obs_key_added='Cell_compartment', enrichment_ranking_metric='P-value', return_enrichment_res=True, inplace=True)
 Gene-set enrichment for each cluster.
 For every category in data.obs[cluster_key], the function performs an Enrichr analysis via gseapy using the list of proteins (genes) present in that cluster. The most significant term (according to enrichment_ranking_metric) is written back to data.obs under obs_key_added.
- Parameters:
 - data : AnnData
   Input AnnData with proteins as observations.
 - cluster_key : str (default: 'leiden')
   Categorical column in data.obs containing cluster labels.
 - gene_name_key : str (default: 'Gene_name_canonical')
   Column in data.obs that holds gene symbols; required by gseapy.
 - gene_sets : str (default: 'custom_goterms_genes_reviewed.gmt')
   Gene set database to use for the enrichment analysis.
 - obs_key_added : str (default: 'Cell_compartment')
   Name of the column that will store the top enriched term per cluster.
 - enrichment_ranking_metric : Literal['P-value', 'Odds Ratio', 'Combined Score'] (default: 'P-value')
   Column used to rank results within each cluster. Valid options are "P-value", "Odds Ratio" and "Combined Score".
 - return_enrichment_res : bool (default: True)
   If True, return the full pandas.DataFrame of Enrichr results.
 - inplace : bool (default: True)
   If True (default), annotate data in place. Otherwise a modified copy is returned.
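 To illustrate how enrichment_ranking_metric could select one term per cluster, here is a minimal sketch in plain Python. It is not the library's implementation: the helper name top_term_per_cluster, the "cluster" and "Term" keys, and all numeric values are hypothetical, and the ranking direction (lowest "P-value" wins, highest "Odds Ratio" or "Combined Score" wins) is an assumption based on how those statistics are usually read.

```python
# Hypothetical Enrichr-style result rows; every value here is invented for illustration.
rows = [
    {"cluster": "0", "Term": "mitochondrion", "P-value": 1e-8, "Odds Ratio": 12.0, "Combined Score": 220.0},
    {"cluster": "0", "Term": "cytosol", "P-value": 3e-3, "Odds Ratio": 2.5, "Combined Score": 14.0},
    {"cluster": "1", "Term": "nucleus", "P-value": 2e-5, "Odds Ratio": 6.0, "Combined Score": 65.0},
    {"cluster": "1", "Term": "nucleolus", "P-value": 4e-4, "Odds Ratio": 9.0, "Combined Score": 70.0},
]

VALID_METRICS = ("P-value", "Odds Ratio", "Combined Score")


def top_term_per_cluster(rows, metric="P-value"):
    """Return {cluster: best term} under the assumed ranking direction."""
    if metric not in VALID_METRICS:
        raise ValueError(f"metric must be one of {VALID_METRICS}, got {metric!r}")

    # Assumption: a smaller P-value is better; larger is better for the other two.
    smaller_is_better = metric == "P-value"

    best = {}
    for row in rows:
        current = best.get(row["cluster"])
        if current is None:
            best[row["cluster"]] = row
        elif smaller_is_better and row[metric] < current[metric]:
            best[row["cluster"]] = row
        elif not smaller_is_better and row[metric] > current[metric]:
            best[row["cluster"]] = row

    return {cluster: row["Term"] for cluster, row in best.items()}


by_pvalue = top_term_per_cluster(rows, "P-value")
by_odds = top_term_per_cluster(rows, "Odds Ratio")
```

 With these invented rows, the choice of metric changes the label for cluster "1" but not for cluster "0", which shows why the selected metric matters for the column stored under obs_key_added.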
 - Return type:
 - Returns:
 Behaviour depends on inplace and return_enrichment_res:
 - inplace=True → annotate data; return the results DataFrame if return_enrichment_res, else None.
 - inplace=False → return either a new AnnData or an (adata, results) tuple.
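
 The four combinations of inplace and return_enrichment_res can be summarised with a small stand-alone sketch. The helper resolve_return is hypothetical and is not part of grassp. It assumes that, with inplace=False, the tuple form is returned when return_enrichment_res is True and the bare copy otherwise. Plain Python objects stand in for the AnnData and the results DataFrame.

```python
def resolve_return(data, results, inplace=True, return_enrichment_res=True):
    """Hypothetical helper mirroring the documented return rules."""
    if inplace:
        # Data was annotated in place; only the results table may be returned.
        return results if return_enrichment_res else None
    # Assumption: the copy is returned alone unless the results were requested.
    return (data, results) if return_enrichment_res else data


# Stand-ins for illustration only (not real AnnData / DataFrame objects).
adata_stub = {"obs": {"Cell_compartment": ["nucleus", "cytosol"]}}
results_stub = [{"Term": "nucleus", "P-value": 2e-5}]

# inplace=True: the caller keeps using its own object and gets results or None.
res = resolve_return(adata_stub, results_stub, inplace=True, return_enrichment_res=True)
nothing = resolve_return(adata_stub, results_stub, inplace=True, return_enrichment_res=False)

# inplace=False: the caller must capture the returned copy (and results, if requested).
copy_only = resolve_return(adata_stub, results_stub, inplace=False, return_enrichment_res=False)
copy_and_res = resolve_return(adata_stub, results_stub, inplace=False, return_enrichment_res=True)
```

 In practice this means a call with inplace=False should always assign the return value, and should unpack two values only when return_enrichment_res is True.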