API Reference

This section provides detailed documentation for all classes and functions in scDataset.

Main Dataset Class

scDataset

Iterable PyTorch Dataset for on-disk data collections with flexible sampling strategies.

Multi-Modal Data Support

MultiIndexable

Container for multiple indexable objects that should be indexed together.
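The idea of indexing several objects together can be illustrated with a minimal pure-Python sketch. The class below, PairedIndexable, is a hypothetical stand-in written for this page. It is not the library's MultiIndexable and makes no claim about that class's constructor or behavior. It only shows the general pattern: one index is applied to every member, so the rows stay aligned across modalities.

```python
class PairedIndexable:
    """Hypothetical stand-in: applies one index to several same-length sequences."""

    def __init__(self, **named):
        # All members must share a length so that row i means the same sample everywhere.
        lengths = {len(values) for values in named.values()}
        if len(lengths) > 1:
            raise ValueError("all members must have the same length")
        self._named = named
        self._length = lengths.pop() if lengths else 0

    def __len__(self):
        return self._length

    def __getitem__(self, index):
        # An int or slice is passed straight through to each member.
        if isinstance(index, (int, slice)):
            return {name: values[index] for name, values in self._named.items()}
        # Any other iterable is treated as a list of row positions.
        return {name: [values[i] for i in index] for name, values in self._named.items()}


# Illustrative data only: two "modalities" with three aligned rows each.
genes = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
proteins = [[1], [2], [3]]

paired = PairedIndexable(genes=genes, proteins=proteins)
batch = paired[[2, 0]]  # rows 2 and 0 from both members, in that order
```

Because the same positions are applied to every member, a batch drawn this way never mixes the expression profile of one sample with the protein measurements of another.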

Sampling Strategies

SamplingStrategy

Abstract base class for sampling strategies.

Streaming

Sequential streaming sampling strategy with optional buffer-level shuffling.

BlockShuffling

Block-based shuffling sampling strategy.

BlockWeightedSampling

Weighted sampling with block-based shuffling.

ClassBalancedSampling

Class-balanced sampling with automatic weight computation.
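Two of the ideas named above, block-based shuffling and automatic class-balanced weights, can be sketched with the standard library alone. The functions block_shuffled_indices and class_balanced_weights are hypothetical helpers written for illustration. They are not part of scDataset, and the library's own strategies may differ in details such as buffering, seeding, and how weights are normalized. The sketch assumes that "class-balanced" means weighting each sample by the inverse of its class frequency.

```python
import random
from collections import Counter


def block_shuffled_indices(n, block_size, seed=None):
    """Hypothetical helper: shuffle the order of contiguous blocks, not single indices."""
    rng = random.Random(seed)
    blocks = [
        list(range(start, min(start + block_size, n)))
        for start in range(0, n, block_size)
    ]
    rng.shuffle(blocks)  # only the block order is randomized
    return [i for block in blocks for i in block]


def class_balanced_weights(labels):
    """Hypothetical helper: weight each sample by 1 / (size of its class)."""
    counts = Counter(labels)
    return [1.0 / counts[label] for label in labels]


# Indices stay contiguous inside each block, which keeps on-disk reads mostly sequential.
order = block_shuffled_indices(10, block_size=4, seed=0)

# Three samples of class "a" and one of class "b": each class gets the same total weight.
labels = ["a", "a", "a", "b"]
weights = class_balanced_weights(labels)

# Weighted sampling with replacement, using the standard library's random.choices.
sampled = random.Random(0).choices(range(len(labels)), weights=weights, k=8)
```

Shuffling whole blocks trades some randomness for contiguous reads, which is the usual motivation for block-based strategies on on-disk data. The inverse-frequency weights make a rare class as likely to be drawn as a common one.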