scDLKit#

AnnData-native deep-learning baselines for single-cell workflows.

scDLKit is designed to sit alongside Scanpy, not replace it. Use Scanpy for loading and exploratory single-cell analysis, then use scDLKit to train, compare, and evaluate baseline deep-learning models with a small, reproducible API.

Install

Set up the CPU or GPU tutorial path from PyPI, including the new tutorials extra.

Scanpy PBMC quickstart

Start with the primary notebook tutorial built on scanpy.datasets.pbmc3k_processed().

Tutorials

Open the notebook walkthroughs for representation learning, model comparison, and classification.

Scanpy integration

See how to store scDLKit latent embeddings in adata.obsm and continue with standard Scanpy analysis.
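The hand-off described here can be sketched as follows. This is a minimal illustration, not scDLKit's own API: the helper names `store_latent` and `continue_with_scanpy` and the key `"X_scdlkit"` are hypothetical, and `latent` is assumed to be an array with one row per cell that you obtained from a trained model. Only `sc.pp.neighbors(..., use_rep=...)` and `sc.tl.umap` are standard Scanpy calls.

```python
def store_latent(adata, latent, key="X_scdlkit"):
    """Attach a latent matrix to adata.obsm under a chosen key.

    `key` is a hypothetical name; any obsm key works as long as the
    same key is passed to Scanpy afterwards.
    """
    # obsm entries must have exactly one row per cell (observation).
    if len(latent) != adata.n_obs:
        raise ValueError(
            f"latent has {len(latent)} rows but adata has {adata.n_obs} cells"
        )
    adata.obsm[key] = latent
    return adata


def continue_with_scanpy(adata, key="X_scdlkit"):
    """Build the neighbor graph and UMAP on the stored latent space."""
    import scanpy as sc  # imported lazily so the helper above has no hard dependency

    sc.pp.neighbors(adata, use_rep=key)  # use the latent space instead of X_pca
    sc.tl.umap(adata)                    # writes adata.obsm["X_umap"]
    return adata
```

Keeping the embedding in `adata.obsm` means downstream Scanpy steps (clustering, plotting) need no scDLKit-specific code; they only need the key.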


Why scDLKit#

  • AnnData-native model training and evaluation

  • baseline-first deep-learning workflows for single-cell data

  • one shared CPU/GPU path with device="auto"

  • reproducible reports, plots, and tutorial notebooks

  • clean separation between model workflows and Scanpy analysis

  • release hardening: built-in benchmark gates run before any default changes

  • gene-expression scope first, while defaults and tutorials are still being hardened
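To make the `device="auto"` idea in the list above concrete, here is one way such a setting is commonly resolved. This is a sketch under assumptions, not scDLKit's implementation: the function name `resolve_device` is hypothetical, and it assumes a PyTorch backend, falling back to the CPU when PyTorch or CUDA is unavailable.

```python
def resolve_device(device: str = "auto") -> str:
    """Return a concrete device string for a requested device setting.

    Explicit requests such as "cpu" or "cuda" are passed through unchanged;
    only "auto" triggers detection.
    """
    if device != "auto":
        return device
    try:
        import torch  # optional dependency in this sketch
    except ImportError:
        # Without PyTorch there is nothing to probe, so stay on the CPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"
```

With a resolver like this, the same script runs unchanged on a laptop and on a GPU machine, which is the point of a single shared CPU/GPU path.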

Example outputs#

Latent UMAP from the Scanpy PBMC quickstart. A healthy quickstart run should separate the major PBMC populations into broad regions rather than collapsing into a single mixed cloud.
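The "separated regions versus one mixed cloud" check can also be made numeric. The sketch below computes a mean silhouette coefficient over embedding coordinates and cell-type labels in plain Python; it is an illustrative check, not a metric the scDLKit reports are confirmed to use, and the function name `mean_silhouette` is hypothetical. For real datasets, `sklearn.metrics.silhouette_score` computes the same quantity far more efficiently.

```python
import math
from collections import defaultdict


def mean_silhouette(points, labels):
    """Mean silhouette coefficient: near 1 = well separated, near 0 or below = mixed."""
    clusters = defaultdict(list)
    for idx, label in enumerate(labels):
        clusters[label].append(idx)
    if len(clusters) < 2:
        raise ValueError("need at least two distinct labels")

    def mean_dist(i, members):
        # Average distance from point i to the given members, excluding i itself.
        others = [j for j in members if j != i]
        if not others:
            return 0.0
        return sum(math.dist(points[i], points[j]) for j in others) / len(others)

    scores = []
    for i, label in enumerate(labels):
        own = clusters[label]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        a = mean_dist(i, own)  # cohesion: distance to own cluster
        b = min(               # separation: distance to the nearest other cluster
            mean_dist(i, members)
            for other, members in clusters.items()
            if other != label
        )
        denom = max(a, b)
        scores.append((b - a) / denom if denom > 0 else 0.0)
    return sum(scores) / len(scores)


# Toy coordinates standing in for a 2-D embedding.
coords = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
separated = mean_silhouette(coords, ["B", "B", "T", "T"])  # labels match the two groups
mixed = mean_silhouette(coords, ["B", "T", "B", "T"])      # labels cut across the groups
```

In this toy case the well-separated labelling scores close to 1 and the mixed labelling scores below 0, which mirrors the visual distinction the caption describes.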

Latent PCA from the synthetic smoke example: the latent embedding produced by the first end-to-end scDLKit walkthrough.

Benchmark comparison chart from the PBMC model-comparison tutorial.