grassp.tl.calculate_interfacialness_score
- calculate_interfacialness_score(data, compartment_annotation_column, neighbors_key=None, obsp=None, exclude_category=None)
Quantify interfacialness of proteins across compartment boundaries.
The score is based on a modified Jaccard index computed from each protein’s immediate neighbourhood:
1. For a given protein, count how many of its neighbours belong to each compartment (the categories in compartment_annotation_column).
2. Sort the counts and take the two highest, d1 and d2, for compartments k1 and k2.
3. Compute
   score = (d1 + d2) / (N_k1 + N_k2 - (d1 + d2))
   where N_k is the total number of proteins annotated as compartment k in the dataset.
High scores indicate that a protein sits at an interface between two compartments.
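The steps above can be sketched in plain Python for a single protein. This is a minimal illustration under stated assumptions, not grassp's implementation: the helper name interfacialness_score, its inputs (a list of neighbour labels and a dict of compartment sizes), and the toy numbers are all hypothetical. How the library handles a protein with fewer than two neighbour compartments is not stated here; the sketch assumes the missing count is 0 and returns NaN when the denominator is not positive.

```python
from collections import Counter


def interfacialness_score(neighbour_labels, compartment_sizes):
    """Hypothetical helper: modified Jaccard score for one protein.

    neighbour_labels: compartment label of each neighbour of the protein.
    compartment_sizes: dict mapping compartment k to N_k, the total number
        of proteins annotated as k in the dataset.
    """
    # Step 1: count neighbours per compartment.
    counts = Counter(neighbour_labels)

    # Step 2: take the two highest counts (d1, d2) and their labels (k1, k2).
    top = counts.most_common(2)
    while len(top) < 2:
        # Assumption: a missing second compartment contributes a count of 0.
        top.append((None, 0))
    (k1, d1), (k2, d2) = top

    # Step 3: score = (d1 + d2) / (N_k1 + N_k2 - (d1 + d2)).
    denom = compartment_sizes.get(k1, 0) + compartment_sizes.get(k2, 0) - (d1 + d2)
    score = (d1 + d2) / denom if denom > 0 else float("nan")
    return {"score": score, "d1": d1, "d2": d2, "k1": k1, "k2": k2}


# Toy data: 12 neighbours, 6 in ER, 4 in Golgi, 2 in Mito.
result = interfacialness_score(
    ["ER"] * 6 + ["Golgi"] * 4 + ["Mito"] * 2,
    {"ER": 30, "Golgi": 20, "Mito": 40},
)
print(result["score"])  # (6 + 4) / (30 + 20 - 10) = 0.25
```

In this toy case the protein's neighbourhood is split between ER and Golgi, so both counts enter the numerator and the two compartment sizes enter the denominator.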
New columns are appended to data.obs with the jaccard_ prefix.
- Parameters:
- data : AnnData
  anndata.AnnData with a neighbour graph and compartment annotations for each protein.
- compartment_annotation_column : str
  Observation column containing the ground-truth compartment labels.
- neighbors_key : str | None (default: None)
  Key identifying which neighbour graph to use (mirrors Scanpy conventions).
- obsp : str | None (default: None)
  Key in data.obsp identifying which neighbour graph to use (mirrors Scanpy conventions).
- exclude_category : Union[str, List[str], None] (default: None)
  One or more category labels (e.g. ‘unknown’) to ignore when counting neighbours.
- Return type:
- Returns:
  anndata.AnnData object with additional columns:
  - jaccard_score: Interfacialness score.
  - jaccard_d1, jaccard_d2: Counts of the two dominant neighbour compartments.
  - jaccard_k1, jaccard_k2: Corresponding compartment labels.
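To show how exclude_category and the jaccard_ columns relate, the sketch below filters ignored labels out of the neighbour counts and collects one row of jaccard_ values per protein. It is an illustration only, not the library's code: the function score_table, the dict-based neighbour list, and the toy proteins are hypothetical. It assumes excluded labels are dropped from the neighbour counts only; whether grassp also removes them from the compartment totals N_k is not stated here.

```python
from collections import Counter


def score_table(labels, neighbours, exclude_category=None):
    """Hypothetical sketch: per-protein jaccard_* values from a neighbour list.

    labels: dict mapping protein -> compartment label.
    neighbours: dict mapping protein -> list of neighbouring proteins.
    exclude_category: one label or a list of labels to ignore when counting.
    """
    # Normalise exclude_category (str, list of str, or None) to a set.
    if exclude_category is None:
        excluded = set()
    elif isinstance(exclude_category, str):
        excluded = {exclude_category}
    else:
        excluded = set(exclude_category)

    sizes = Counter(labels.values())  # N_k for every compartment k
    rows = {}
    for protein, nbrs in neighbours.items():
        # Assumption: excluded labels are skipped while counting neighbours.
        counts = Counter(labels[n] for n in nbrs if labels[n] not in excluded)
        top = counts.most_common(2)
        while len(top) < 2:
            top.append((None, 0))
        (k1, d1), (k2, d2) = top
        denom = sizes.get(k1, 0) + sizes.get(k2, 0) - (d1 + d2)
        rows[protein] = {
            "jaccard_score": (d1 + d2) / denom if denom > 0 else float("nan"),
            "jaccard_d1": d1,
            "jaccard_d2": d2,
            "jaccard_k1": k1,
            "jaccard_k2": k2,
        }
    return rows


# Toy data: 4 ER proteins, 3 Golgi proteins, 2 unannotated ("unknown").
labels = {
    "p1": "ER", "p2": "ER", "p3": "ER", "p4": "ER",
    "p5": "Golgi", "p6": "Golgi", "p7": "Golgi",
    "p8": "unknown", "p9": "unknown",
}
neighbours = {"p1": ["p2", "p3", "p5", "p8", "p9"]}

rows = score_table(labels, neighbours, exclude_category="unknown")
print(rows["p1"]["jaccard_score"])  # (2 + 1) / (4 + 3 - 3) = 0.75
```

Without the exclusion, the two "unknown" neighbours would displace Golgi as the second compartment, which is the situation exclude_category is meant to avoid.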