
ABSTRACT
Recent advances in single-cell technologies, including single-cell ATAC-seq (scATAC-seq), have enabled large-scale profiling of the chromatin accessibility landscape at the
single-cell level. However, the characteristics of scATAC-seq data, including high sparsity and high dimensionality, have greatly complicated the computational analysis. Here, we propose
scDEC, a computational tool for scATAC-seq analysis with deep generative neural networks. scDEC is built on a pair of generative adversarial networks, and is capable of simultaneously
learning the latent representation and inferring cell labels. In a series of experiments, scDEC demonstrates superior performance over other tools in scATAC-seq analysis across multiple
datasets and experimental settings. In downstream applications, we demonstrate that the generative power of scDEC helps to infer the trajectory and intermediate state of cells during
differentiation and the latent features learned by scDEC can potentially reveal both biological cell types and within-cell-type variations. We also show that it is possible to extend scDEC
for the integrative analysis of multi-modal single cell data.

DATA AVAILABILITY
The InSilico dataset was collected from
the GEO database with accession number GSE65360. The mouse Forebrain dataset was downloaded from the GEO database with accession number GSE100033. The Splenocyte dataset can be accessed at
the ArrayExpress database with accession number E-MTAB-6714. The All blood dataset can be accessed at the GEO database with accession number GSE96772. The mouse atlas data are available at
http://atlas.gs.washington.edu/mouse-atac. The human PBMCs dataset used in multi-modal single cell analysis was downloaded from 10x Genomics
(https://support.10xgenomics.com/single-cell-multiome-atac-gex) with entry ‘pbmc_granulocyte_sorted_10k’. The preprocessed scATAC-seq data used as input for the scDEC model in this study can be
downloaded from https://doi.org/10.5281/zenodo.3977858 (ref. 56).

CODE AVAILABILITY
scDEC is open-source software based on the TensorFlow library (ref. 57). It is available on GitHub
(https://github.com/kimmo1019/scDEC) and Zenodo (https://doi.org/10.5281/zenodo.4560834; ref. 58). A CodeOcean capsule with several example datasets is available at
https://codeocean.com/capsule/0746056/tree/v1 (ref. 59). Pretrained models for both the benchmark single-cell datasets and the 10x Genomics PBMCs multi-modal single-cell dataset are provided.

REFERENCES
1. Klemm, S. L., Shipony, Z. & Greenleaf, W. J. Chromatin accessibility and the regulatory epigenome. _Nat. Rev. Genet._ 20, 207–220 (2019).
2. Corces, M. R. et al. The chromatin accessibility landscape of primary human cancers. _Science_ 362, eaav1898 (2018).
3. Stuart, T. & Satija, R. Integrative single-cell analysis. _Nat. Rev. Genet._ 20, 257–272 (2019).
4. Cusanovich, D. A. et al. Multiplex single-cell profiling of chromatin accessibility by combinatorial cellular indexing. _Science_ 348, 910–914 (2015).
5. Buenrostro, J. D. et al. Single-cell chromatin accessibility reveals principles of regulatory variation. _Nature_ 523, 486–490 (2015).
6. Chen, H. et al. Assessment of computational methods for the analysis of single-cell ATAC-seq data. _Genome Biol._ 20, 241 (2019).
7. Zamanighomi, M. et al. Unsupervised clustering and epigenetic classification of single cells. _Nat. Commun._ 9, 2410 (2018).
8. González-Blas, C. B. et al. cisTopic: cis-regulatory topic modeling on single-cell ATAC-seq data. _Nat. Methods_ 16, 397–400 (2019).
9. Cusanovich, D. A. et al. A single-cell atlas of in vivo mammalian chromatin accessibility. _Cell_ 174, 1309–1324.e1318 (2018).
10. Baker, S. M., Rogerson, C., Hayes, A., Sharrocks, A. D. & Rattray, M. Classifying cells with Scasat, a single-cell ATAC-seq analysis tool. _Nucleic Acids Res._ 47, e10 (2019).
11. Fang, R. et al. Comprehensive analysis of single cell ATAC-seq data with SnapATAC. _Nat. Commun._ 12, 1337 (2021).
12. Goodfellow, I. et al. Generative adversarial nets. In _Proceedings of Advances in Neural Information Processing Systems_ (_NeurIPS_) 2672–2680 (NIPS, 2014).
13. Kingma, D. P. & Welling, M. Auto-encoding variational bayes. In _Proceedings of International Conference on Learning Representations_ (ICLR, 2014).
14. Liu, Q., Lv, H. & Jiang, R. hicGAN infers super resolution Hi-C data with generative adversarial networks. _Bioinformatics_ 35, i99–i107 (2019).
15. Xiong, L. et al. SCALE method for single-cell ATAC-seq analysis via latent feature extraction. _Nat. Commun._ 10, 4576 (2019).
16. Zhu, J.-Y., Park, T., Isola, P. & Efros, A. A. Unpaired image-to-image translation using cycle-consistent adversarial networks. In _Proceedings of the IEEE International Conference on Computer Vision_ 2223–2232 (ICCV, 2017).
17. Liu, Q., Xu, J., Jiang, R. & Wong, W. H. Density estimation using deep generative neural networks. _Proc. Natl Acad. Sci. USA_ 118, e2101344118 (2021).
18. van der Maaten, L. & Hinton, G. Visualizing data using t-SNE. _J. Mach. Learn. Res._ 9, 2579–2605 (2008).
19. McInnes, L., Healy, J. & Melville, J. UMAP: uniform manifold approximation and projection. _J. Open Source Software_ 3, 861 (2018).
20. Blondel, V. D., Guillaume, J.-L., Lambiotte, R. & Lefebvre, E. Fast unfolding of communities in large networks. _J. Stat. Mech._ 2008, P10008 (2008).
21. Preissl, S. et al. Single-nucleus analysis of accessible chromatin in developing mouse forebrain reveals cell-type-specific transcriptional regulation. _Nat. Neurosci._ 21, 432–439 (2018).
22. Chen, X., Miragaia, R. J., Natarajan, K. N. & Teichmann, S. A. A rapid and robust method for single cell chromatin accessibility profiling. _Nat. Commun._ 9, 5345 (2018).
23. Buenrostro, J. D. et al. Integrated single-cell analysis maps the continuous regulatory landscape of human hematopoietic differentiation. _Cell_ 173, 1535–1548 (2018).
24. Schep, A. N., Wu, B., Buenrostro, J. D. & Greenleaf, W. J. chromVAR: inferring transcription-factor-associated accessibility from single-cell epigenomic data. _Nat. Methods_ 14, 975–978 (2017).
25. Mathelier, A. et al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. _Nucleic Acids Res._ 44, D110–115 (2016).
26. Shaltouki, A., Peng, J., Liu, Q., Rao, M. S. & Zeng, X. Efficient generation of astrocytes from human pluripotent stem cells in defined conditions. _Stem Cells_ 31, 941–952 (2013).
27. Bayam, E. et al. Genome-wide target analysis of NEUROD2 provides new insights into regulation of cortical projection neuron migration and differentiation. _BMC Genomics_ 16, 681 (2015).
28. Owa, T. et al. Meis1 coordinates cerebellar granule cell development by regulating Pax6 transcription, BMP signaling and Atoh1 degradation. _J. Neurosci._ 38, 1277–1294 (2018).
29. Hallonet, M., Hollemann, T., Pieler, T. & Gruss, P. _Vax1_, a novel homeobox-containing gene, directs development of the basal forebrain and visual system. _Genes Dev._ 13, 3106–3114 (1999).
30. Cesari, F. et al. Mice deficient for the Ets transcription factor Elk-1 show normal immune responses and mildly impaired neuronal gene activation. _Mol. Cell. Biol._ 24, 294–305 (2004).
31. Stolt, C. C. et al. The Sox9 transcription factor determines glial fate choice in the developing spinal cord. _Genes Dev._ 17, 1677–1689 (2003).
32. Street, K. et al. Slingshot: cell lineage and pseudotime inference for single-cell transcriptomics. _BMC Genomics_ 19, 477 (2018).
33. Iwasaki, H. & Akashi, K. Myeloid lineage commitment from the hematopoietic stem cell. _Immunity_ 26, 726–740 (2007).
34. Gilmour, J. et al. A crucial role for the ubiquitously expressed transcription factor Sp1 at early stages of hematopoietic specification. _Development_ 141, 2391–2401 (2014).
35. Anderson, K. C. et al. Expression of human B cell-associated antigens on leukemias and lymphomas: a model of human B cell differentiation. _Blood_ 63, 1424–1433 (1984).
36. Villani, A.-C. et al. Single-cell RNA-seq reveals new types of human blood dendritic cells, monocytes, and progenitors. _Science_ 356, eaah4573 (2017).
37. Argelaguet, R. et al. MOFA+: a statistical framework for comprehensive integration of multi-modal single-cell data. _Genome Biol._ 21, 111 (2020).
38. Jin, S., Zhang, L. & Nie, Q. scAI: an unsupervised approach for the integrative analysis of parallel single-cell transcriptomic and epigenomic profiles. _Genome Biol._ 21, 25 (2020).
39. Stuart, T. et al. Comprehensive integration of single-cell data. _Cell_ 177, 1888–1902.e1821 (2019).
40. Korsunsky, I. et al. Fast, sensitive and accurate integration of single-cell data with Harmony. _Nat. Methods_ 16, 1289–1296 (2019).
41. Teller, V. Speech and language processing: an introduction to natural language processing, computational linguistics, and speech recognition. _Comput. Linguist._ 26, 638–641 (2000).
42. Chowdhury, G. G. _Introduction to Modern Information Retrieval_ (Facet, 2010).
43. Halko, N., Martinsson, P.-G. & Tropp, J. A. Finding structure with randomness: probabilistic algorithms for constructing approximate matrix decompositions. _SIAM Rev._ 53, 217–288 (2011).
44. Pedregosa, F. et al. Scikit-learn: machine learning in Python. _J. Mach. Learn. Res._ 12, 2825–2830 (2011).
45. Gulrajani, I., Ahmed, F., Arjovsky, M., Dumoulin, V. & Courville, A. C. Improved training of Wasserstein GANs. In _Proceedings of Advances in Neural Information Processing Systems_ 5767–5777 (NIPS, 2017).
46. Yi, Z., Zhang, H., Tan, P. & Gong, M. Dualgan: Unsupervised dual learning for image-to-image translation. In _Proceedings of the IEEE International Conference on Computer Vision_ 2849–2857 (ICCV, 2017).
47. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. In _Proceedings of International Conference on Learning Representations_ (ICLR, 2014).
48. Mukherjee, S., Asnani, H., Lin, E. & Kannan, S. In _Proceedings of the AAAI Conference on Artificial Intelligence_ Vol. 33, 4610–4617 (AAAI, 2019).
49. Ioffe, S. & Szegedy, C. Batch normalization: accelerating deep network training by reducing internal covariate shift. In _Proceedings of the 32nd International Conference on Machine Learning_ 448–456 (ICML, 2015).
50. Strehl, A. & Ghosh, J. Cluster ensembles—a knowledge reuse framework for combining multiple partitions. _J. Mach. Learn. Res._ 3, 583–617 (2002).
51. Hubert, L. & Arabie, P. Comparing partitions. _J. Classification_ 2, 193–218 (1985).
52. Rosenberg, A. & Hirschberg, J. V-measure: A conditional entropy-based external cluster evaluation measure. In _Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning_ 410–420 (EMNLP-CoNLL, 2007).
53. Rand, W. M. Objective criteria for the evaluation of clustering methods. _J. Am. Stat. Assoc._ 66, 846–850 (1971).
54. Tibshirani, R., Walther, G. & Hastie, T. Estimating the number of clusters in a data set via the gap statistic. _J. R. Stat. Soc. B_ 63, 411–423 (2001).
55. Mann, H. B. & Whitney, D. R. On a test of whether one of two random variables is stochastically larger than the other. _Ann. Math. Stat._ 18, 50–60 (1947).
56. Liu, Q. et al. scDEC: data for simultaneous deep generative modeling and clustering of single cell genomic data. _Zenodo_ https://doi.org/10.5281/zenodo.3984189 (2020).
57. Abadi, M. et al. Tensorflow: a system for large-scale machine learning. In _Proceedings of 12th USENIX Symposium on Operating Systems Design and Implementation_ 265–283 (OSDI, 2016).
58. Liu, Q. et al. scDEC: code for simultaneous deep generative modeling and clustering of single cell genomic data. _Zenodo_ https://doi.org/10.5281/zenodo.4560834 (2021).
59. Liu, Q. et al. scDEC: simultaneous deep generative modeling and clustering of single cell genomic data. _CodeOcean_ https://doi.org/10.24433/CO.3347162.v1 (2020).

ACKNOWLEDGEMENTS
This work was supported by NIH grants R01 HG010359 (W.H.W.) and P50 HG007735 (W.H.W.). This work was also
supported by the National Key Research and Development Program of China no. 2018YFC0910404 (R.J.), the National Natural Science Foundation of China nos 61873141 (R.J.), 61721003 (R.J.) and
61573207 (R.J.).

AUTHOR INFORMATION
AUTHORS AND AFFILIATIONS
* Ministry of Education Key Laboratory of Bioinformatics, Research Department of Bioinformatics at the Beijing National Research Center for Information Science and Technology, Center for Synthetic and Systems Biology, Department of Automation, Tsinghua University, Beijing, China: Qiao Liu, Shengquan Chen & Rui Jiang
* Department of Statistics, Stanford University, Stanford, CA, USA: Qiao Liu & Wing Hung Wong
* Department of Biomedical Data Science, Bio-X Program, Center for Personal Dynamic Regulomes, Stanford University, Stanford, CA, USA: Wing Hung Wong

CONTRIBUTIONS
W.H.W., R.J. and Q.L. conceived the study. Q.L. designed and implemented scDEC. Q.L.,
S.C. and W.H.W. performed the data analysis. Q.L. and W.H.W. interpreted the results. Q.L., R.J. and W.H.W. wrote the manuscript.

CORRESPONDING AUTHORS
Correspondence to Rui Jiang or Wing Hung Wong.

ETHICS DECLARATIONS
COMPETING INTERESTS
The authors declare no competing interests.

ADDITIONAL INFORMATION
PEER REVIEW INFORMATION
_Nature Machine Intelligence_ thanks the anonymous reviewers for their contribution to the peer review of this work.

PUBLISHER’S NOTE
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

SUPPLEMENTARY INFORMATION
Supplementary Figs. 1–18 and Tables 1–6.

ABOUT THIS ARTICLE
CITE THIS ARTICLE
Liu, Q., Chen, S., Jiang, R. _et al._ Simultaneous deep generative modelling and clustering of single-cell genomic data. _Nat Mach Intell_ 3, 536–544 (2021). https://doi.org/10.1038/s42256-021-00333-y

* Received: 14 August 2020
* Accepted: 14 March 2021
* Published: 10 May 2021
* Issue Date: June 2021
* DOI: https://doi.org/10.1038/s42256-021-00333-y