scDLKit#
AnnData-native deep-learning baselines for single-cell workflows.
scDLKit is designed to sit alongside Scanpy, not replace it. Use Scanpy for loading and exploratory single-cell analysis, then use scDLKit to train, compare, and evaluate baseline deep-learning models with a small, reproducible API.
Set up the CPU or GPU tutorial path from PyPI, including the new tutorials extra.
Start with the primary notebook tutorial built on scanpy.datasets.pbmc3k_processed().
Open the notebook walkthroughs for representation learning, model comparison, and classification.
See how to store scDLKit latent embeddings in adata.obsm and continue with standard Scanpy analysis.
Why scDLKit#
AnnData-native model training and evaluation
baseline-first deep-learning workflows for single-cell data
one shared CPU/GPU path with device="auto"
reproducible reports, plots, and tutorial notebooks
clean separation between model workflows and Scanpy analysis
release hardening driven by built-in benchmark gates before defaults change
gene-expression scope first, while defaults and tutorials are still being hardened
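One way to picture the shared CPU/GPU path is a small resolver that maps device="auto" to a concrete device. This is a sketch that assumes a PyTorch backend. The function name resolve_device is hypothetical and does not describe scDLKit's internal implementation.

```python
def resolve_device(device: str = "auto") -> str:
    """Return a concrete device string for the requested device.

    Hypothetical helper: an explicit request is passed through unchanged,
    while "auto" picks CUDA when PyTorch reports a usable GPU.
    """
    if device != "auto":
        return device
    try:
        import torch  # assumption: the deep-learning backend is PyTorch
    except ImportError:
        return "cpu"  # without PyTorch there is nothing to probe
    return "cuda" if torch.cuda.is_available() else "cpu"


print(resolve_device("auto"))
```

Resolving the device in one place means the same notebook cell can run unchanged on a laptop and on a GPU machine.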
Example outputs#
Latent UMAP from the Scanpy PBMC quickstart. A healthy quickstart run should separate the major PBMC populations into broad regions rather than collapsing into a single mixed cloud.#
Latent embedding produced by the first end-to-end scDLKit walkthrough.#
Benchmark comparison chart from the PBMC model-comparison tutorial.#
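The visual check described for the latent UMAP (separated regions rather than one mixed cloud) can be complemented with a number. The sketch below computes the mean silhouette coefficient with NumPy on toy data. The silhouette is a generic clustering metric chosen here for illustration, not necessarily one that scDLKit reports, and the function name mean_silhouette is hypothetical.

```python
import numpy as np


def mean_silhouette(embedding, labels):
    """Mean silhouette coefficient using Euclidean distances."""
    emb = np.asarray(embedding, dtype=float)
    labels = np.asarray(labels)
    # Pairwise distances via |x|^2 + |y|^2 - 2 x.y to avoid a 3-D temporary.
    sq = (emb ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * emb @ emb.T
    dist = np.sqrt(np.maximum(d2, 0.0))
    np.fill_diagonal(dist, 0.0)

    groups = np.unique(labels)
    scores = np.zeros(len(labels))
    for i in range(len(labels)):
        own = labels == labels[i]
        n_own = own.sum()
        if n_own == 1:
            continue  # singleton clusters score 0 by convention
        a = dist[i, own].sum() / (n_own - 1)  # self-distance is 0
        b = min(dist[i, labels == g].mean() for g in groups if g != labels[i])
        scores[i] = (b - a) / max(a, b)
    return scores.mean()


rng = np.random.default_rng(0)
labels = np.array([0] * 50 + [1] * 50)

# Two well-separated blobs: labels match the geometry.
separated = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(10, 1, (50, 2))])
# One blob with arbitrary labels: a "single mixed cloud".
mixed = rng.normal(0, 1, (100, 2))

score_separated = mean_silhouette(separated, labels)
score_mixed = mean_silhouette(mixed, labels)
print(score_separated, score_mixed)
```

Scores near 1 indicate well-separated groups, while scores near 0 indicate labels that overlap in the embedding. For real data, the labels would come from existing cell-type annotations in adata.obs.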
Recommended learning path#
Install the tutorial dependencies from PyPI.
Run the Scanpy PBMC quickstart notebook in the default quickstart profile.
Switch that notebook to the full profile when you want a longer run and stronger qualitative separation.
Continue with the model comparison notebook, which treats PCA as the classical reference baseline.
Use the classification notebook once you want a supervised baseline.
Keep the synthetic notebook only as a minimal smoke-test or fallback path.
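The learning path treats PCA as the classical reference baseline. As a rough illustration of what such a reference embedding is, the sketch below computes PCA scores with a plain NumPy SVD on random toy data. The function name pca_embedding is hypothetical, and the comparison notebook may compute its PCA baseline differently (for example through Scanpy).

```python
import numpy as np


def pca_embedding(X, n_components=10):
    """Project rows of X onto the top principal components."""
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature (gene)
    # Rows of Vt are principal axes; U * S gives the component scores.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return U[:, :n_components] * S[:n_components]


rng = np.random.default_rng(0)
X = rng.normal(size=(200, 30))  # toy matrix: 200 cells x 30 genes
Z = pca_embedding(X, n_components=5)
print(Z.shape)
```

A learned latent space is only worth its extra training cost if it improves on an embedding this cheap, which is why a linear baseline is a useful reference point.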