What is scArches?
scArches lets you analyze single-cell query data by integrating it into a reference atlas. To map your data, you need an integrated atlas built with one of the reference-building methods that scArches supports. These methods cover different applications, including:
Annotating a single-cell dataset using a reference atlas: See the models and tutorials for scPoli (De Donno et al., 2022) or scANVI (Xu et al., 2019).
Identifying novel cell states in your data by mapping to an atlas: To detect cell states affected by disease or novel subpopulations, see treeArches (Michielsen*, Lotfollahi* et al., 2022), as well as a similar use case that maps to the Human Lung Cell Atlas.
Multimodal single-cell atlases: See the tutorial for Multigrate (Litinetskaya*, Lotfollahi* et al., 2022) to work with CITE-seq and Multiome (ATAC + RNA) data. For joint analysis of T-cell receptor (TCR) and scRNAseq data, see mvTCR (Drost et al., 2022). To impute missing surface proteins for your query scRNAseq data using a CITE-seq reference, see totalVI (Gayoso et al., 2019).
Data integration/batch correction: To integrate multiple scRNAseq datasets, see scVI (Lopez et al., 2018) or trVAE (Lotfollahi et al., 2020). If batch effects are strong and cell-type labels are available, consider scGen (Lotfollahi et al., 2019).
Spatial transcriptomics: To map scRNAseq data to a spatial reference and infer spatial locations, see SageNet (Heidari et al., 2022).
Querying gene programs in single-cell atlases: Using gene programs (GPs), you can embed your datasets into known subspaces (e.g., interferon signaling) and examine the activity of your query dataset within the GPs of interest. You can use available GP databases (e.g., GO pathways) or your own curated GPs; see expiMap (Lotfollahi*, Rybakov* et al., 2023). Novel GPs can also be learned, as shown here.
Links to the papers can be found here.
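
To make the gene-program idea above more concrete, here is a toy sketch in plain Python. It is not expiMap's model and does not use the scArches API: it only shows how a set of GPs can be turned into a binary gene mask and how a crude per-cell "activity" (the mean expression of a program's genes) could be read off. The function names, gene sets, and expression values are all made up for illustration.

```python
# Toy illustration only: NOT expiMap's model and NOT the scArches API.
# Function names, gene programs, and expression values are hypothetical.

def build_gp_mask(genes, gene_programs):
    """Return, for each program, a 0/1 list marking which genes belong to it."""
    return {
        gp: [1 if gene in members else 0 for gene in genes]
        for gp, members in gene_programs.items()
    }


def gp_activity(expression, mask):
    """Score each cell per program as the mean expression of the program's genes."""
    scores = []
    for cell in expression:
        cell_scores = {}
        for gp, row in mask.items():
            n_genes = sum(row)
            total = sum(value * flag for value, flag in zip(cell, row))
            # Programs with no genes in the dataset get a score of 0.0.
            cell_scores[gp] = total / n_genes if n_genes else 0.0
        scores.append(cell_scores)
    return scores


genes = ["IFIT1", "ISG15", "MX1", "CD3E", "CD19"]
gene_programs = {
    "interferon_signaling": {"IFIT1", "ISG15", "MX1"},
    "t_cell_marker": {"CD3E"},
}
# Rows are cells, columns follow the order of `genes`.
expression = [
    [3.0, 6.0, 3.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 2.0, 0.0],
]

mask = build_gp_mask(genes, gene_programs)
scores = gp_activity(expression, mask)
for i, cell_scores in enumerate(scores):
    print(i, cell_scores)
```

In this sketch the first cell scores high on the interferon program and the second on the T-cell marker program. A real analysis would rely on the trained model and tutorials referenced above rather than on a simple mean.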