grassp.tl.svm_annotation
- svm_annotation(data, gt_col='markers', C=None, gamma=None, fix_markers=False, min_probability=0.5, inplace=True, key_added='svm_annotation', params_key=None, class_weight='balanced')
Classify proteins using SVM with marker-based training.
Trains an SVM classifier on marker proteins (non-NaN values in gt_col) and predicts localization for all proteins. Hyperparameters can be provided manually or loaded from a prior svm_train() call.
Similar to knn_annotation() but uses SVM instead of graph propagation.
- Parameters:
  - data
    AnnData
    anndata.AnnData with feature matrix in .X.
  - gt_col
    str (default: 'markers')
    Observation column with marker labels (NaN for unknowns).
  - C
    float | None (default: None)
    SVM regularization parameter. If None, loads from .uns.
  - gamma
    float | str | None (default: None)
    RBF kernel coefficient. If None, loads from .uns.
  - fix_markers
    bool (default: False)
    If True, marker proteins retain their original labels with probability 1.0.
  - min_probability
    float (default: 0.5)
    Confidence threshold; predictions below this are set to NaN.
  - inplace
    bool (default: True)
    If True, modify data in place; else return dict.
  - key_added
    str (default: 'svm_annotation')
    Base name for results (default "svm_annotation").
  - params_key
    str | None (default: None)
    Key to load hyperparameters from .uns (default "svm.params").
  - class_weight
    None | dict | Literal['balanced'] (default: 'balanced')
- Returns:
  None or dict
  If inplace=True, modifies data with:
  - .obs[f"{key_added}"]: Predicted labels
  - .obs[f"{key_added}_probability"]: Max probability per protein
  - .obsm[f"{key_added}_probabilities"]: Full probability matrix
  - .uns[f"{key_added}_colors"]: Color scheme (copied from gt_col)
  If inplace=False, returns dict with predictions and probabilities.
- Raises:
  - ValueError – If no hyperparameters found and none provided manually.
  - KeyError – If gt_col not found in .obs.
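The interaction between min_probability and fix_markers can be illustrated with a small NumPy sketch. This is not grassp's implementation: postprocess_probabilities is a hypothetical helper, the strict "less than" comparison against the threshold is an assumption, and the class names and probabilities are made up for illustration.

```python
import numpy as np


def postprocess_probabilities(proba, classes, markers,
                              min_probability=0.5, fix_markers=False):
    """Hypothetical helper (not part of grassp): labels from a probability matrix."""
    proba = np.asarray(proba, dtype=float)
    classes = np.asarray(classes, dtype=object)
    markers = np.asarray(markers, dtype=object)

    # Most likely class and its probability for every row (protein).
    labels = classes[proba.argmax(axis=1)].astype(object)
    max_proba = proba.max(axis=1)

    # Assumption: "below" means strictly less than the threshold.
    labels[max_proba < min_probability] = np.nan

    if fix_markers:
        # Marker rows are those with a string label; unknowns are NaN.
        is_marker = np.array([isinstance(m, str) for m in markers])
        labels[is_marker] = markers[is_marker]
        max_proba[is_marker] = 1.0

    return labels, max_proba


# Illustrative data: 4 proteins, 3 classes; rows 0 and 3 are markers.
classes = ["ER", "Golgi", "Nucleus"]
proba = np.array([
    [0.80, 0.10, 0.10],
    [0.40, 0.35, 0.25],   # max 0.40 is below the threshold
    [0.20, 0.70, 0.10],
    [0.10, 0.30, 0.60],   # predicted "Nucleus", but the marker says "Golgi"
])
markers = ["ER", np.nan, np.nan, "Golgi"]

labels, probs = postprocess_probabilities(
    proba, classes, markers, min_probability=0.5, fix_markers=True
)
print(labels)  # row 1 is NaN; row 3 keeps its marker label
print(probs)
```

In this sketch, row 1 falls below the threshold and becomes NaN, while row 3 keeps its marker label "Golgi" with probability 1.0 even though the classifier favoured another class.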
Examples
>>> import grassp as gr
>>> import scanpy as sc
>>> adata = gr.ds.hein_2024(enrichment="enriched")

##### Option 1: Annotate directly, with fixed hyperparameters #####
>>> gr.tl.svm_annotation(
...     adata,
...     gt_col="hein2024_gt_component",
...     min_probability=0.5,
...     C=10,
...     gamma=0.01,
... )
>>> sc.pl.umap(adata, color="svm_annotation")  # doctest: +SKIP

##### Option 2: Train SVM hyperparameters, then annotate #####
# When actually training, increase cv_repeats and cv_splits
# We recommend >20 repeats with 5 splits
>>> gr.tl.svm_train(adata, gt_col="hein2024_gt_component", cv_repeats=2, cv_splits=2, random_state=42)
Fitting 4 folds for each of 54 candidates, totalling 216 fits
>>> adata.uns["svm.params"]["best_params"]
{'C': 2.0, 'gamma': 0.01}
>>> gr.tl.svm_annotation(adata, gt_col="hein2024_gt_component", min_probability=0.5)
>>> sc.pl.umap(adata, color="svm_annotation")  # doctest: +SKIP
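For readers who want to see the general idea of marker-based training outside of grassp, here is a minimal sketch using scikit-learn's SVC on synthetic data. It is an illustration under stated assumptions, not the library's code: the synthetic matrix X stands in for the feature matrix in .X, the class names are invented, and the use of probability=True to obtain per-class probabilities is an assumption about how such probabilities might be produced.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for the feature matrix: 60 "proteins" x 4 features,
# drawn around three well-separated cluster centres.
centers = np.array([[0, 0, 0, 0], [5, 5, 0, 0], [0, 5, 5, 0]], dtype=float)
true_class = np.repeat([0, 1, 2], 20)
X = centers[true_class] + rng.normal(scale=0.5, size=(60, 4))
names = np.array(["ER", "Golgi", "Nucleus"], dtype=object)

# Marker column: the first 10 proteins of each cluster are labelled,
# the rest are NaN (unknown), mirroring the gt_col convention.
markers = np.full(60, np.nan, dtype=object)
labelled = np.tile(np.arange(20) < 10, 3)
markers[labelled] = names[true_class[labelled]]

# Train only on rows that carry a marker label.
is_marker = np.array([isinstance(m, str) for m in markers])
clf = SVC(
    kernel="rbf",
    C=10,
    gamma=0.01,
    probability=True,        # assumption: needed here for predict_proba
    class_weight="balanced",
    random_state=0,
)
clf.fit(X[is_marker], markers[is_marker].astype(str))

# Predict class probabilities for all rows, markers and unknowns alike.
proba = clf.predict_proba(X)                     # columns follow clf.classes_
predicted = clf.classes_[proba.argmax(axis=1)]
print(proba.shape, list(clf.classes_))
```

The probability matrix has one row per protein and one column per marker class, which is the same shape of information the function is documented to store under .obsm.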