grassp.pp.add_markers
- add_markers(data, species, authors=None, uniprot_id_column=None, add_colors=True)
Annotate proteins with marker annotations from literature.
Matches protein IDs in .obs against a collection of marker annotations from different authors. Note that marker IDs are species-specific and may not be UniProt accessions (see the table of ID types by species below).
Marker annotations are sourced from:
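===============================================================  ========================================
authors                                                          source
===============================================================  ========================================
hein2024_gt_component                                            Marker list used in Hein et al. 2024, Cell, https://doi.org/10.1016/j.cell.2024.11.028
hein2024_component                                               Full annotations from Hein et al. 2024, Cell, https://doi.org/10.1016/j.cell.2024.11.028
lilley, christopher, geladaki, itzhak, villaneuva, christoforou  Obtained from pRoloc. See: https://bioconductor.org/packages/pRoloc/ and https://lgatto.github.io/pRoloc/reference/pRolocmarkers.html
===============================================================  ========================================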
Protein ID types by species:
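============  ================================  ==================  =============
Species Code  Common Name                       ID Type             Example ID
============  ================================  ==================  =============
atha          Arabidopsis thaliana              TAIR/Araport        AT1G01620
dmel          Drosophila melanogaster           UniProt             A1Z6P3
ggal          Gallus gallus (Chicken)           IPI                 IPI00570752.1
hsap          Homo sapiens (Human)              UniProt             A0AVT1
mmus          Mus musculus (Mouse)              UniProt             A2AJ15
scer          Saccharomyces cerevisiae (Yeast)  UniProt             D6VTK4
toxo          Toxoplasma gondii                 ToxoDB Gene IDs     TGME49_200250
tryp          Trypanosoma brucei                TriTrypDB Gene IDs  Tb11.v5.0162
============  ================================  ==================  =============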
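Because the expected ID type differs by species, it can help to check what your IDs look like before annotating. The following is a minimal sketch using only the standard library. The helper `fraction_matching` and both regular expressions are illustrative assumptions and are not part of grassp: the UniProt pattern follows the accession format published by UniProt, and the TAIR pattern covers standard AGI locus codes such as AT1G01620.

```python
import re

# Heuristic patterns, assumed for illustration only (not part of grassp).
# UniProt accession format as published by UniProt.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)
# AGI locus codes: AT + chromosome (1-5, C or M) + G + five digits.
TAIR_LOCUS = re.compile(r"AT[1-5CM]G\d{5}")


def fraction_matching(ids, pattern):
    """Return the share of IDs that fully match the given pattern."""
    ids = list(ids)
    if not ids:
        return 0.0
    hits = sum(1 for identifier in ids if pattern.fullmatch(identifier))
    return hits / len(ids)


# Example IDs taken from the species table above.
uniprot_like = ["A0AVT1", "A2AJ15", "D6VTK4"]
tair_like = ["AT1G01620"]

print(fraction_matching(uniprot_like, UNIPROT_ACCESSION))
print(fraction_matching(tair_like, UNIPROT_ACCESSION))
print(fraction_matching(tair_like, TAIR_LOCUS))
```

A low matching fraction for the species you selected suggests the IDs need converting, or that a different `.obs` column should be passed as `uniprot_id_column`.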
This function modifies the AnnData object in-place by adding marker annotation columns to .obs.
- Parameters:
- data
AnnData. The AnnData object to annotate.
- species
str. Species code that determines which marker file to read. Examples: 'hsap' (human), 'mmus' (mouse), 'scer' (yeast), 'atha' (Arabidopsis), 'dmel' (fly), 'toxo' (Toxoplasma), 'tryp' (Trypanosoma), 'ggal' (chicken).
- authors
list[str] | str | None (default: None). Specific author column(s) to include from the marker file. Can be a single author name (string) or a list of author names. If None, all available author columns are included.
- uniprot_id_column
str | None (default: None). Column in .obs containing protein IDs (see the ID type required for each species in the table above). If None, .obs_names is used.
- add_colors
bool (default: True). If True, automatically add color mappings to .uns for each marker column, following scanpy plotting conventions. Colors are stored as '{author}_colors' lists matching the categorical order.
- Return type:
None
- Returns:
None. Modifies data.obs in-place by adding marker annotation columns (converted to categorical dtype). If add_colors=True, also adds color mappings to data.uns as '{author}_colors' lists.
Examples
>>> import grassp as gr
>>> import pandas as pd
>>> adata = gr.datasets.hein_2024(enrichment='raw')
>>> # Add specific author annotations
>>> gr.pp.add_markers(adata, species='hsap', authors=['christopher'])
Added christopher annotations for ...
>>> # Check categorical dtype and colors
>>> isinstance(adata.obs['christopher'].dtype, pd.CategoricalDtype)
True
>>> 'christopher_colors' in adata.uns
True
>>> # Disable automatic color mapping
>>> gr.pp.add_markers(adata, species='hsap', authors=['lilley'], add_colors=False)
Added lilley annotations for ...
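The '{author}_colors' convention mentioned above pairs each category with a color by position: the i-th color in the list belongs to the i-th category of the column. The sketch below illustrates only that positional alignment in plain Python. The helper `color_entry`, the compartment names, and the palette are hypothetical; this is not how grassp chooses its colors.

```python
def color_entry(key, categories, palette):
    """Build a ('<key>_colors', [colors...]) pair aligned with category order."""
    if not palette:
        raise ValueError("palette must contain at least one color")
    # Cycle through the palette if there are more categories than colors.
    colors = [palette[i % len(palette)] for i in range(len(categories))]
    return f"{key}_colors", colors


# Hypothetical categories and palette, for illustration only.
categories = ["ER", "Golgi", "Mitochondrion"]
palette = ["#1f77b4", "#ff7f0e"]

uns_key, colors = color_entry("christopher", categories, palette)

# Positional alignment lets you recover a category-to-color lookup.
lookup = dict(zip(categories, colors))
print(uns_key, lookup)
```

With a real AnnData object, the same lookup could be rebuilt by zipping `adata.obs['christopher'].cat.categories` with `adata.uns['christopher_colors']`, assuming both were populated with `add_colors=True`.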