Download options

You can obtain Portal data by downloading a single project, by creating a custom dataset with a selection of projects and/or samples, or by choosing one of the Portal-wide download options. For full information about the files you can download from the Portal, see the Downloadable Files page.

This page describes the different options available to you when downloading Portal data.

Data format

We provide all single-cell and single-nuclei expression data both as SingleCellExperiment objects (.rds files) for use in R and as AnnData objects (.h5ad files) for use in Python. The default format for all samples on the Portal with single-cell and single-nuclei expression is SingleCellExperiment (R). You can learn more about using these object types from our FAQ sections on using the provided RDS files and using the provided H5AD files.

Only one data format is currently supported for a single download, including when downloading custom datasets. To obtain data in both SingleCellExperiment and AnnData formats, you will need to download these file formats separately.

The only exception to this rule is spatial transcriptomics data, which is available only in Spaceranger format. Spatial data can be coupled with single-cell data in either the SingleCellExperiment or AnnData format, but not with both.

In addition, note that expression data for multiplexed libraries is only available in SingleCellExperiment format, as described here.
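The format rules above can be summarized in a short Python sketch. This is only an illustration of the rules as stated on this page, not part of the Portal or any Portal API: the `DataFormat` enum, the `available_formats` function, and the modality strings `"single-cell"` and `"spatial"` are all hypothetical names chosen for this example.

```python
from enum import Enum


class DataFormat(Enum):
    """Data formats named on this page (values are illustrative labels only)."""
    SINGLE_CELL_EXPERIMENT = "SingleCellExperiment (.rds)"
    ANN_DATA = "AnnData (.h5ad)"
    SPACERANGER = "Spaceranger"


def available_formats(modality, multiplexed=False):
    """Return the formats this page describes for a given modality.

    Hypothetical helper: `modality` is "single-cell" or "spatial", and
    `multiplexed` marks libraries that contain multiplexed samples.
    """
    if modality == "spatial":
        # Spatial transcriptomics data is only available as Spaceranger output.
        return [DataFormat.SPACERANGER]
    if modality == "single-cell":
        if multiplexed:
            # Multiplexed libraries are only available as SingleCellExperiment.
            return [DataFormat.SINGLE_CELL_EXPERIMENT]
        # The default format (SingleCellExperiment) is listed first.
        return [DataFormat.SINGLE_CELL_EXPERIMENT, DataFormat.ANN_DATA]
    raise ValueError(f"Unknown modality: {modality!r}")


# Example: list the formats for a non-multiplexed single-cell library.
for fmt in available_formats("single-cell"):
    print(fmt.value)
```

Because a single download supports only one of these formats, obtaining both SingleCellExperiment and AnnData objects would mean making one download per format.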

Modalities

Besides single-cell/nuclei expression, many samples in the Portal have additional sequencing modalities including CITE-seq, spatial transcriptomics, and bulk RNA-seq.

In particular, there are two modality options that you may see when creating a custom dataset to download or when downloading a full project with the “Download Now” button: Single-cell and Spatial.

By default, the Single-cell modality will be selected for all single-cell and single-nuclei RNA-seq samples and/or projects. Selecting this download option will provide you with the gene expression data from single-cell or single-nuclei samples and/or projects. If available, CITE-seq expression data will also be included.

If a sample or project has spatial transcriptomic data, you will also have the option to select the Spatial modality for download. Selecting Spatial will provide you with the spatial transcriptomic data only.

If you are creating a custom dataset that contains samples and/or projects with bulk RNA-seq data, you will have the option to include this data in your download as well. Note that the bulk RNA-seq expression file will always include all samples from the given project with bulk expression, even if you are only downloading a subset of that project’s samples. If you are using the “Download Now” button to download a full project that contains bulk RNA-seq expression, the bulk data will automatically be included with the download.

For more information about the expected file download structure for Single-cell and Spatial modalities, refer to our Downloadable Files page.

Merged objects

When downloading a project, either by using Download Now or Add to Dataset, you will have the option to receive the data either as objects for individual libraries or as a single merged object with data from all samples in the given project. Please be aware that merged objects have not been integrated or batch-corrected. Refer to this documentation for the contents of a merged object download specifically. Note that the merged object option applies only to Single-cell modality downloads, not Spatial.

When creating a custom dataset to download, you will be able to select the option to merge all samples only if you have included all samples from the given project in My Dataset. Merging a subset of samples in a project is not currently supported. In addition, merged objects are not available for all samples or projects, as described here.

Note that even when downloading data for all single-cell and single-nuclei samples on the Portal, merged objects will still be provided per-project. There will not be one merged object containing all samples from all projects; instead, there will be a single merged object for each project.

Multiplexed sample libraries

When downloading a project that contains multiplexed samples (see What is a multiplexed sample?), you will have the option to exclude multiplexed samples from the download. If selected, the download will contain expression data for only non-multiplexed samples. Note that, as described in our FAQ, AnnData objects (.h5ad files) are not available for multiplexed samples. In addition, you will not be able to select the option to merge samples into a single file if the project contains multiplexed samples.
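As a rough summary, the conditions this page gives for the merged object option can be written as a small check. This is a minimal sketch under stated assumptions, not Portal code: the `merge_blockers` function, its parameters, and the sample names are hypothetical, and it covers only the conditions mentioned on this page (other cases where merged objects are unavailable are not modeled).

```python
def merge_blockers(modality, selected_samples, project_samples, has_multiplexed):
    """Return the reasons a merged object would not be offered.

    Hypothetical helper: an empty list means none of the conditions
    described on this page rule out a merged object.
    """
    blockers = []
    if modality != "single-cell":
        # Merged objects apply only to Single-cell modality downloads.
        blockers.append("merged objects apply only to the Single-cell modality")
    if set(selected_samples) != set(project_samples):
        # Merging a subset of a project's samples is not supported.
        blockers.append("all samples in the project must be selected")
    if has_multiplexed:
        # Projects with multiplexed samples cannot be merged into one file.
        blockers.append("the project contains multiplexed samples")
    return blockers


# Example with made-up sample names: only part of the project is selected.
project = ["sample_a", "sample_b", "sample_c"]
print(merge_blockers("single-cell", ["sample_a"], project, has_multiplexed=False))
```

Returning the list of reasons, rather than a single true or false value, makes it easier to see which condition is preventing the merge.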