scDLKit

AnnData-native deep-learning baselines for single-cell workflows.

scDLKit is designed to sit alongside Scanpy, not replace it. Use Scanpy for loading and exploratory single-cell analysis, then use scDLKit to train, compare, and evaluate baseline deep-learning models with a small, reproducible API.

Install

Set up the CPU or GPU tutorial path from PyPI, including the new tutorials extra.

Scanpy PBMC quickstart

Start with the primary notebook tutorial built on scanpy.datasets.pbmc3k_processed().

Tutorials

Open the notebook walkthroughs for representation learning, model comparison, and classification.

Scanpy integration

See how to store scDLKit latent embeddings in adata.obsm and continue with standard Scanpy analysis.

Why scDLKit

AnnData-native model training and evaluation
baseline-first deep-learning workflows for single-cell data
one shared CPU/GPU path with device="auto"
reproducible reports, plots, and tutorial notebooks
clean separation between model workflows and Scanpy analysis
release hardening through built-in benchmark gates that run before any default changes
gene-expression scope first, while defaults and tutorials are still being hardened
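
To make the `device="auto"` idea concrete, here is a minimal sketch of how such a setting could be resolved. The helper name `resolve_device` is hypothetical and is not taken from scDLKit, and the sketch assumes a PyTorch backend. It falls back to the CPU when PyTorch is missing or no CUDA device is visible.

```python
def resolve_device(device: str = "auto") -> str:
    """Return a concrete device string for a requested device setting.

    Hypothetical helper for illustration only; not scDLKit's implementation.
    """
    # An explicit choice such as "cpu" or "cuda" is passed through unchanged.
    if device != "auto":
        return device
    try:
        import torch  # assumption: the models run on PyTorch
    except ImportError:
        # Without PyTorch there is no GPU path to select.
        return "cpu"
    # Prefer CUDA when a device is visible, otherwise use the CPU.
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("auto"))
```

With a single resolution step like this, the same notebook can run on a laptop or a GPU machine without edits.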

Example outputs

Latent UMAP from the Scanpy PBMC quickstart. A healthy quickstart run should separate the major PBMC populations into broad regions rather than collapsing into a single mixed cloud.

Latent PCA from the synthetic smoke example: the latent embedding produced by the first end-to-end scDLKit walkthrough.

PBMC comparison plot from the benchmark tutorial: the benchmark comparison chart from the PBMC model-comparison tutorial.

Recommended learning path

Install the tutorial dependencies from PyPI.
Run the Scanpy PBMC quickstart notebook in the default quickstart profile.
Switch that notebook to the full profile when you want a longer run and stronger qualitative separation.
Continue with the model comparison notebook, which treats PCA as the classical reference baseline.
Use the classification notebook once you want a supervised baseline.
Keep the synthetic notebook only as a minimal smoke test or fallback path.