grassp.tl.knn_annotation

knn_annotation(data, gt_col, fix_markers=False, class_balance=True, min_probability=0.5, inplace=True, obsp_key='connectivities', key_added='knn_annotation')

Propagate categorical annotations along the k-NN graph.

For each observation, the function inspects its neighbourhood in data.obsp[obsp_key] (generated by scanpy.pp.neighbors()) and calculates a weighted probability for each label category.
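The idea of a neighbour-weighted label probability can be illustrated with a small NumPy sketch. This is not the grassp implementation: the toy connectivity matrix, the labels, and the row normalisation are assumptions chosen only to show the principle.

```python
import numpy as np

# Toy connectivity matrix for 4 observations (hypothetical values, zero diagonal).
conn = np.array([
    [0.0, 1.0, 0.5, 0.0],
    [1.0, 0.0, 0.5, 0.0],
    [0.5, 0.5, 0.0, 1.0],
    [0.0, 0.0, 1.0, 0.0],
])

# Source annotations; None marks an unannotated observation.
labels = np.array(["ER", "ER", None, "Golgi"], dtype=object)
categories = ["ER", "Golgi"]

# One-hot encode the annotations (unannotated rows are all zeros).
one_hot = np.array(
    [[lab == cat for cat in categories] for lab in labels], dtype=float
)

# Sum the neighbour weights per category, then normalise each row.
scores = conn @ one_hot
row_sums = scores.sum(axis=1, keepdims=True)
probabilities = np.divide(
    scores, row_sums, out=np.zeros_like(scores), where=row_sums > 0
)
```

In this toy graph, observation 2 has two ER neighbours with weight 0.5 each and one Golgi neighbour with weight 1.0, so both categories end up with equal probability. Observation 3 has no annotated neighbour, so its row stays at zero.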

Parameters:
data

anndata.AnnData with a populated neighbour graph (distances or connectivities).

gt_col

Observation column containing the source annotations to be propagated.

fix_markers default: False

If True, marker probabilities are not overwritten by the propagated labels.

class_balance default: True

If True, ground-truth compartments that contain many proteins are downweighted in proportion to their size so that they do not dominate the propagated labels.
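One way to picture the effect of class balancing is to divide each category's votes by the number of annotated members of that category. The exact weighting grassp applies is not stated here, so the inverse-size scaling below is an assumption used purely for illustration.

```python
import numpy as np

categories = np.array(["ER", "Golgi"])
# Three annotated ER proteins and one annotated Golgi protein (toy data).
labels = np.array(["ER", "ER", "ER", "Golgi"])
one_hot = (labels[:, None] == categories[None, :]).astype(float)

# Assumed balancing scheme: scale each category by 1 / (number of members).
class_sizes = one_hot.sum(axis=0)
balanced = one_hot / class_sizes

# An unannotated observation connected equally to all four annotated ones.
weights = np.array([1.0, 1.0, 1.0, 1.0])

raw = weights @ one_hot
raw_prob = raw / raw.sum()

bal = weights @ balanced
bal_prob = bal / bal.sum()
```

Without balancing, the larger ER class receives probability 0.75 simply because it has more members. With the assumed inverse-size weighting, both categories receive 0.5.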

min_probability default: 0.5

If the probability of the most probable label is below this threshold, the label is set to np.nan.
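The thresholding rule can be sketched as follows. The probability matrix and category names are made up for the example, and the strict "below" comparison follows the wording above.

```python
import numpy as np

categories = np.array(["ER", "Golgi", "Mito"])
min_probability = 0.5

# Toy propagated probabilities, one row per observation.
probabilities = np.array([
    [0.8, 0.1, 0.1],
    [0.4, 0.3, 0.3],
    [0.2, 0.5, 0.3],
])

best = probabilities.argmax(axis=1)   # index of the most probable label
top = probabilities.max(axis=1)       # probability of that label

# Use an object array so labels and np.nan can coexist.
labels = categories[best].astype(object)
labels[top < min_probability] = np.nan
```

Here the second observation is set to np.nan because its best label only reaches 0.4, while the third keeps "Golgi" because 0.5 is not below the threshold.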

obsp_key default: 'connectivities'

Name of the neighbour connectivity graph to use.

key_added default: 'knn_annotation'

Name of the new column that will hold the propagated annotation.

Returns:

Modified anndata object with the following new entries:

- .obsm[f"{key_added}_probabilities"] containing the propagated probabilities
- .obs[f"{key_added}"] containing the propagated labels (most probable label)
- .uns[f"{key_added}_colors"] to make sure plotting uses the same colors as the ground truth labels
- .obs[f"{key_added}_probability"] containing the probability of the most probable label
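Because every output name is derived from key_added, the entries to look up after a call can be listed with a small helper. The function name expected_output_keys is hypothetical and not part of grassp; it only restates the naming pattern described above.

```python
def expected_output_keys(key_added="knn_annotation"):
    """Return the AnnData slots and keys described in the Returns section.

    Hypothetical helper for illustration; it does not call grassp.
    """
    return {
        "obsm": [f"{key_added}_probabilities"],
        "obs": [key_added, f"{key_added}_probability"],
        "uns": [f"{key_added}_colors"],
    }


keys = expected_output_keys("my_labels")
```

With key_added="my_labels", the propagated labels would be expected in .obs["my_labels"] and the full probability matrix in .obsm["my_labels_probabilities"].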