TY - JOUR
T1 - Multi-omics integration in the age of million single-cell data
AU - Miao, Zhen
AU - Humphreys, Benjamin D.
AU - McMahon, Andrew P.
AU - Kim, Junhyong
N1 - Funding Information:
This work was supported in part by grant UC2DK126024 to J.K., B.D.H. and A.P.M., and by a Health Research Formula Fund of the Commonwealth of Pennsylvania, which did not have a direct role in the work.
Publisher Copyright:
© 2021, Springer Nature Limited.
PY - 2021/11
Y1 - 2021/11
N2 - An explosion in single-cell technologies has revealed a previously underappreciated heterogeneity of cell types and novel cell-state associations with sex, disease, development and other processes. Starting with transcriptome analyses, single-cell techniques have extended to multi-omics approaches and now enable the simultaneous measurement of data modalities and spatial cellular context. Data are now available for millions of cells, for whole-genome measurements and for multiple modalities. Although analyses of such multimodal datasets have the potential to provide new insights into biological processes that cannot be inferred with a single mode of assay, the integration of very large, complex, multimodal data into biological models and mechanisms represents a considerable challenge. An understanding of the principles of data integration and visualization methods is required to determine what methods are best applied to a particular single-cell dataset. Each class of method has advantages and pitfalls in terms of its ability to achieve various biological goals, including cell-type classification, regulatory network modelling and biological process inference. In choosing a data integration strategy, consideration must be given to whether the multi-omics data are matched (that is, measured on the same cell) or unmatched (that is, measured on different cells) and, more importantly, the overall modelling and visualization goals of the integrated analysis.
UR - http://www.scopus.com/inward/record.url?scp=85113196912&partnerID=8YFLogxK
U2 - 10.1038/s41581-021-00463-x
DO - 10.1038/s41581-021-00463-x
M3 - Review article
C2 - 34417589
AN - SCOPUS:85113196912
SN - 1759-5061
VL - 17
SP - 710
EP - 724
JO - Nature Reviews Nephrology
JF - Nature Reviews Nephrology
IS - 11
ER -