Integrative single-cell analysis
Recent advances in molecular biology, microfluidics and nanotechnology have given rise to a multitude of single-cell sequencing technologies (Fig. 1). Initial methods have focused on measurements of a single modality (for example, DNA sequence, RNA expression or chromatin accessibility). Although these technologies have yielded transformative insights into cellular diversity and development, this segregation is driven by methodological convenience and limits the ability to derive a deep understanding of the relationships between biomolecules in single cells. Understanding these interactions is key to deriving a deep understanding of the cellular state and remains a challenge for the field of single-cell analysis. Moreover, as the scale and availability of data sets rapidly grow, new computational methods are needed for normalization and joint analysis across samples, even in the presence of significant batch effects or interindividual variation.

Single-cell RNA sequencing (scRNA-seq) is one of the most widely used single-cell sequencing approaches, with a range of technologies for sensitive, highly multiplexed or combinatorially barcoded profiling [1–8]. These advances have accompanied a variety of complementary single-cell genomic, epigenomic and proteomic profiling technologies, including methods for single-cell measurements of genome sequence [9,10], chromatin accessibility [11–15], DNA methylation [11,16–19], cell surface proteins [20,21], small RNAs [22], histone modifications [23,24] and chromosomal conformation [25,26]. Furthermore, recent efforts have pioneered methods to accurately record spatial or lineage information in single-cell studies [27–35] (Fig. 1; Table 1).

An idealized experimental workflow would observe all aspects of the cell, including a full history of its molecular states, spatial positions and environmental interactions. Although outside the bounds of current technology, multimodal technologies and integrative computational methods enable us to move closer to this aspirational and exciting goal. In
this Review, we describe the currently available methods for single-cell transcriptomics, genomics, epigenomics and proteomics with an emphasis on those methods that provide multimodal data or data that can be integrated into a multimodal analysis. We focus on the analysis of scRNA-seq data in conjunction with other data types, as these are currently the most commonly used and well-established methods. In particular, we discuss those methods capable of integrating data from the same individual cell wherever possible. We discuss these methods and their challenges in depth, as well as their potential applications and future directions.

Multimodal single-cell measurements

Single-cell molecular profiling technologies initially focused on the development of methods capable of accurately detecting a single aspect of the cell state, first with simple semiquantitative readouts [36] and later utilizing high-throughput DNA sequencing [37]. More recently, there has been considerable interest in the simultaneous profiling of multiple types of molecule within a single cell (multimodal profiling) to build a much more comprehensive molecular view of the cell. Often, these methods couple scRNA-seq with the measurement of another cellular characteristic, such as DNA sequence, protein abundance or epigenomic state. Multimodal data can be obtained from single cells using four broad strategies (Fig. 2): first, the use of an initial non-destructive assay before sequencing; second, the separation of different cellular fractions for parallel experimental workflows; third, the experimental conversion of multimodal data into a common molecular format to enable the simultaneous detection of multiple data types via a common methodology, such as DNA sequencing; and fourth, the analysis of different data types encoded in nucleotide sequences, such as RNA abundance and sequence polymorphisms.

Single-cell RNA sequencing (scRNA-seq). Sequencing of cDNAs derived from RNA molecules (usually polyadenylated mRNAs) from a single cell. It
is typically performed for many hundreds to thousands of cells in a single experiment.

Multimodal. Data of multiple types, for example, of RNA and protein.

Tim Stuart¹ and Rahul Satija¹,²*

Abstract | The recent maturation of single-cell RNA sequencing (scRNA-seq) technologies has coincided with transformative new methods to profile genetic, epigenetic, spatial, proteomic and lineage information in individual cells. This provides unique opportunities, alongside computational challenges, for integrative methods that can jointly learn across multiple types of data. Integrated analysis can discover relationships across cellular modalities, learn a holistic representation of the cell state, and enable the pooling of data sets produced across individuals and technologies. In this Review, we discuss the recent advances in the collection and integration of different data types at single-cell resolution with a focus on the integration of gene expression data with other types of single-cell measurement.

¹New York Genome Center, New York, NY, USA.
²Center for Genomics and Systems Biology, New York University, New York, NY, USA.
*e-mail: rsatija@nygenome.org
https://doi.org/10.1038/s41576-019-0093-7
Reviews: Single-cell omics
Nature Reviews | Genetics

Gathering cytometric information before a destructive assay. An initial and elegant solution for multimodal profiling involves the application of non-destructive cytometric measurements before the application of a destructive single-cell assay. As multiple scRNA-seq workflows utilize fluorescence-activated cell sorting (FACS) to deposit individual cells into microtitre plates [2,3,38], it is a natural extension to combine this single-cell isolation with index sorting to gather additional cytometric data about the cells before sequencing (Fig. 2a). Whereas early studies combined measurements of the cell cycle and semiquantitative measurements of mRNAs from the same cell [39], this approach has been particularly fruitful in immunology and
haematology, where well-defined cell surface markers can be used to classify functional cell types and states [40,41] or to enrich for rare cells in heterogeneous populations. Paul et al. [41] and Nestorowa et al. [42] applied this workflow to profile early murine haematopoietic progenitors, revealing the immunophenotypes of transcriptionally defined cell-type clusters. Similarly, Wilson et al. [40] utilized FACS isolation of rare haematopoietic stem cells (HSCs) followed by scRNA-seq and functional assays to identify cell surface markers associated with cells that are able to consistently self-renew [40]. New methods that utilize arrays of picolitre wells have the potential to dramatically increase the scale of these experiments while retaining the ability to gather cytometric data before single-cell assays [43]. However, cytometric methods are fundamentally limited in the number of parameters they can measure for each cell, as they are limited by spectral overlap between the fluorescent reporters.

Index sorting. Fluorescence-activated sorting of cells into known plate

Separation of cellular components. Alternative approaches are required in order to measure aspects of the cell that cannot easily be read out through cellular fluorescence. This requirement is especially relevant for experiments aiming to simultaneously measure mRNAs alongside genomic DNA or intracellular protein in the same cell. In these cases, the physical separation or selective tagging of different cellular fractions from single cells presents an attractive solution (Fig. 2b). Several groups have now achieved parallel genome and transcriptome sequencing from the same cell either through the physical separation of mRNA and genomic DNA using biotinylated oligo(dT) primers [44] or the selective incorporation of T7 promoter sequences into cDNAs allowing subsequent selective amplification of cDNAs over genomic DNA through in vitro transcription [45]. This has allowed a direct association between genotype and gene expression to be made, revealing that DNA copy number variations and chromosomal rearrangements may explain some of the variability in mRNA abundance between individual cells. These methods will be of particular interest for tissues with high levels of somatic genetic variation, such as tumours.

[Fig. 1 panel labels]
Lineage: scGESTALT [32], ScarTrace [33], LINNAEUS [34], MEMOIR [27]
Trajectory (pseudotime): Monocle [71,73], Wishbone [74], Velocyto [70], Diffusion [72]
State, chromatin accessibility: scATAC-seq [13], sciATAC-seq [14], scTHS-seq [15], 10X Genomics
State, cell surface proteins: CITE-seq [20], REAP-seq [21], FACS [41,42]
State, spatial position: MERFISH [107,108,109], smFISH [102], STARmap [31]
State, histone modifications: scChIPseq [23,24]
State, genome sequence: SNS [9], SCI-seq [10]
State, DNA methylation: scBS-seq [17], snmC-seq [16], sci-MET [19]
State, intracellular protein: PEA [49,50]
State, mRNA: Drop-seq [4], InDrop [5], Smart-seq2 [38], MARS-seq [3], 10X Genomics [6], SPLiT-seq [8], sci-RNA-seq [7]

Fig. 1 | Multimodal and integrative methods for single-cell analyses. An overview of the current methods for single-cell data integration is shown. A wide variety of single-cell methods have now been developed to measure a broad range of cellular parameters. These methods can be divided into those that determine the current state of the cell, those that determine the cell lineage, and computational methods that order cells along a pseudotemporal trajectory. CITE-seq, cellular indexing of transcriptomes and epitopes by sequencing; FACS, fluorescence-activated cell sorting; LINNAEUS, lineage tracing by nuclease-activated editing of ubiquitous sequences; MARS-seq, massively parallel RNA single-cell sequencing; MEMOIR, memory by engineered mutagenesis with optical in situ readout; MERFISH, multiplexed error-robust fluorescence in situ hybridization; PEA, proximity extension assay; REAP-seq, RNA expression and protein sequencing assay; scATAC-seq, single-cell assay for transposase-accessible chromatin using sequencing; scBS-seq, single-cell bisulfite sequencing; scChIPseq, single-cell chromatin immunoprecipitation followed by sequencing; scGESTALT, single-cell genome editing of synthetic target arrays for lineage tracing; sci-MET, single-cell combinatorial indexing for methylation analysis; sci-RNA-seq, single-cell combinatorial indexing RNA sequencing; SCI-seq, single-cell combinatorial indexed sequencing; sciATAC-seq, single-cell combinatorial indexing assay for transposase-accessible chromatin using sequencing; scTHS-seq, single-cell transposome hypersensitivity site sequencing; smFISH, single-molecule fluorescence in situ hybridization; snmC-seq, single-nucleus methylcytosine sequencing; SNS, single-nucleus sequencing; SPLiT-seq, split-pool ligation-based transcriptome sequencing; STARmap, spatially resolved transcript amplicon readout mapping.

Table 1 | Current experimental methods for unimodal and multimodal single-cell measurements

Data types | Method name | Feature throughput | Cell throughput | Refs

Unimodal
mRNA | Drop-seq | Whole transcriptome | 1,000–10,000 | 4
mRNA | InDrop | Whole transcriptome | 1,000–10,000 | 5
mRNA | 10X Genomics | Whole transcriptome | 1,000–10,000 | 6
mRNA | Smart-seq2 | Whole transcriptome | 100–300 | 38
mRNA | MARS-seq | Whole transcriptome | 100–300 | 3
mRNA | CEL-seq | Whole transcriptome | 100–300 | 1
mRNA | SPLiT-seq | Whole transcriptome | 50,000 | 8
mRNA | sci-RNA-seq | Whole transcriptome | 50,000 | 7
Genome sequence | SNS | Whole genome | 10–100 | 9
Genome sequence | SCI-seq | Whole genome | 10,000–20,000 | 10
Chromatin accessibility | scATAC-seq | Whole genome | 1,000–2,000 | 13
Chromatin accessibility | sciATAC-seq | Whole genome | 10,000–20,000 | 14
Chromatin accessibility | scTHS-seq | Whole genome | 10,000–20,000 | 15
DNA methylation | scBS-seq | Whole genome | 5–20 | 17
DNA methylation | snmC-seq | Whole genome | 1,000–5,000 | 16
DNA methylation | sci-MET | Whole genome | 1,000–5,000 | 19
DNA methylation | scRRBS | Reduced representation genome | 1–10 | 18
Histone modifications | scChIPseq | Whole genome + single modification | 1,000–10,000 | 24
Chromosome conformation | scHi-C-seq | Whole genome | 1–10 | 26

Multimodal
Histone modifications + spatial | NA | Single locus + single modification | 10–100 | 23
mRNA + lineage | scGESTALT | Whole transcriptome | 1,000–10,000 | 32
mRNA + lineage | ScarTrace | Whole transcriptome | 1,000–10,000 | 33
mRNA + lineage | LINNAEUS | Whole transcriptome | 1,000–10,000 | 34
Lineage + spatial | MEMOIR | NA | 10–100 | 27
mRNA + spatial | osmFISH | 10–50 RNAs | 1,000–5,000 | 35
mRNA + spatial | STARmap | 20–1,000 RNAs | 100–30,000 | 31
mRNA + spatial | MERFISH | 100–1,000 RNAs | 100–40,000 | 108
mRNA + spatial | seqFISH | 125–250 RNAs | 100–20,000 | 29
mRNA + cell surface protein | CITE-seq | Whole transcriptome + proteins | 1,000–10,000 | 20
mRNA + cell surface protein | REAP-seq | Whole transcriptome + proteins | 1,000–10,000 | 21
mRNA + chromatin accessibility | sci-CAR | Whole transcriptome + whole genome | 1,000–20,000 | 48
mRNA + DNA methylation | scM&T-seq | Whole genome | 50–100 | 46
mRNA + genomic DNA | G&T-seq | Whole genome + whole transcriptome | 50–200 | 44
mRNA + intracellular protein | NA | 96 mRNAs + 38 proteins | 50–100 | 50
mRNA + intracellular protein | NA | 82 mRNAs + 75 proteins | 50–200 | 49
DNA methylation + chromatin accessibility | scNOMe-seq | Whole genome | 10–20 | 11

CEL-seq, cell expression by linear amplification and sequencing; CITE-seq, cellular indexing of transcriptomes and epitopes by sequencing; G&T-seq, genome and transcriptome sequencing; LINNAEUS, lineage tracing by nuclease-activated editing of ubiquitous sequences; MARS-seq, massively parallel RNA single-cell sequencing; MEMOIR, memory by engineered mutagenesis with optical in situ readout; MERFISH, multiplexed error-robust fluorescence in situ hybridization; osmFISH, cyclic single-molecule fluorescence in situ hybridization; REAP-seq, RNA expression and protein sequencing assay; scATAC-seq, single-cell assay for transposase-accessible chromatin using sequencing; scBS-seq, single-cell bisulfite sequencing; scChIPseq, single-cell chromatin immunoprecipitation followed by sequencing; scGESTALT, single-cell genome editing of synthetic target arrays for lineage tracing; scHi-C-seq, a single-cell Hi-C method for chromosome conformation; sciATAC-seq, single-cell combinatorial indexing assay for transposase-accessible chromatin using sequencing; sci-CAR, single-cell combinatorial indexing chromatin accessibility and mRNA sequencing; sci-MET, single-cell combinatorial indexing for methylation analysis; sci-RNA-seq, single-cell combinatorial indexing RNA sequencing; SCI-seq, single-cell combinatorial indexed sequencing; scM&T-seq, single-cell methylome and transcriptome sequencing; scNOMe-seq, single-cell nucleosome occupancy and methylome sequencing; scRRBS, single-cell reduced representation bisulfite sequencing; scTHS-seq, single-cell transposome hypersensitivity site sequencing; seqFISH, sequential fluorescence in situ hybridization; snmC-seq, single-nucleus methylcytosine sequencing; SNS, single-nucleus sequencing; SPLiT-seq, split-pool ligation-based transcriptome sequencing; STARmap, spatially resolved transcript amplicon readout mapping.

[Fig. 2 panel labels (a–d): FACS, laser, Parameter 1, Parameter 2, scRNA-seq; cytosol, nucleus, scRNA-seq, scBS-seq; antibody barcode sequence (epitope tagging; protein, RNA), lineage barcode sequence and lineage array (lineage tracing, lineage tree), Cas9 and sgRNA sequence (perturbation, regulatory inference); mRNAs, unspliced intron (velocity), somatic mutation (lineage), transcript abundance, transcript profile, genome sequence; multimodal analysis.]

Building on methods established by Macaulay et al. [44] and the single-cell bisulfite sequencing methods pioneered by Smallwood et al. [17], a sodium bisulfite treatment step before PCR amplification of the genomic DNA fraction has allowed the capture of single-cell DNA methylation pat