Biophysics  2022 

Pseudo-Time Analysis of Single-Cell Transcriptome Data Based on Natural Language Processing

DOI: 10.12677/BIPHY.2022.102004, PP. 31-38

Keywords: Single-Cell Sequencing, Pseudo-Time Trajectory Inference, Natural Language Processing, Genomics


Abstract:

Many powerful analytical models and algorithms have been proposed for single-cell transcriptome sequencing data, covering cell clustering, cell type identification, pseudo-time trajectory inference, cellular RNA dynamics, gene regulatory network inference, and RNA velocity analysis. This paper proposes a method that brings natural language processing techniques into single-cell transcriptome data analysis. The algorithm first uses TF-IDF to represent how strongly each gene's expression level influences cell function. It then treats the gene expression changes that arise during cell development as sentences in a natural language, so that text analysis techniques can be applied to the developmental progression of single-cell transcriptomes. Random walks on a gene network generate gene sequence texts, from which embedded vector representations of genes and of cells in the gene space are obtained. These embeddings support pseudo-time visualization of single-cell transcriptome data. The results indicate that the model is an effective method for pseudo-time analysis of cell development from single-cell data.
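The first two steps described in the abstract (TF-IDF weighting of gene expression, then random walks on a gene network to produce gene "sentences") can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how the gene network is built, so the thresholded absolute-correlation graph here is an assumption, as are the function names `tfidf` and `random_walks`, the smoothed IDF formula, the 0.3 threshold, and the toy count matrix.

```python
import numpy as np


def tfidf(counts):
    """TF-IDF weights for a cells x genes count matrix.

    Cells play the role of documents and genes the role of terms.
    """
    counts = np.asarray(counts, dtype=float)
    n_cells = counts.shape[0]
    # Term frequency: each gene's share of the cell's total counts.
    totals = counts.sum(axis=1, keepdims=True)
    tf = np.divide(counts, totals, out=np.zeros_like(counts), where=totals > 0)
    # Smoothed inverse document frequency (an assumed variant):
    # genes detected in fewer cells receive larger weights.
    df = (counts > 0).sum(axis=0)
    idf = np.log((1 + n_cells) / (1 + df)) + 1
    return tf * idf


def random_walks(adjacency, walk_length, walks_per_node, seed=0):
    """Generate gene-index 'sentences' by weighted random walks on a gene graph."""
    rng = np.random.default_rng(seed)
    adjacency = np.asarray(adjacency, dtype=float)
    n_genes = adjacency.shape[0]
    walks = []
    for start in range(n_genes):
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                edge_weights = adjacency[walk[-1]]
                total = edge_weights.sum()
                if total == 0:  # isolated gene: the walk ends early
                    break
                walk.append(int(rng.choice(n_genes, p=edge_weights / total)))
            walks.append(walk)
    return walks


# Toy count matrix: 4 cells (rows) x 4 genes (columns), purely illustrative.
counts = np.array([[5, 0, 3, 0],
                   [4, 1, 0, 0],
                   [0, 6, 0, 2],
                   [0, 5, 1, 3]])
weights = tfidf(counts)

# Assumed gene network: absolute correlation between TF-IDF gene profiles,
# with weak edges (<= 0.3) and self-loops removed.
corr = np.nan_to_num(np.abs(np.corrcoef(weights.T)))
np.fill_diagonal(corr, 0.0)
adjacency = np.where(corr > 0.3, corr, 0.0)

walks = random_walks(adjacency, walk_length=5, walks_per_node=2)
```

Each walk is a list of gene indices that can be read as one sentence. Under the further assumption that a skip-gram style model is used, these sentences would then be passed to a word-embedding trainer to obtain gene vectors, and a cell vector could be formed, for example, as the TF-IDF-weighted average of its gene vectors.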

