BEENE: Deep Learning based Nonlinear Embedding Improves Batch Effect Estimation.

Abstract

Analyzing large-scale single-cell transcriptomic datasets generated using different technologies is challenging due to the presence of batch-specific systematic variations known as batch effects. Since biological and technological differences are often interspersed, detecting and accounting for batch effects in RNA-seq datasets are critical for effective data integration and interpretation. Low-dimensional embeddings, such as principal component analysis (PCA), are widely used for visual inspection and estimation of batch effects. Linear dimensionality reduction methods like PCA are effective in assessing the presence of batch effects, especially when the batch effects exhibit linear patterns. However, batch effects are inherently complex, and existing linear dimensionality reduction methods can be inadequate and imprecise in the presence of sophisticated non-linear batch effects.

We present BEENE, a deep non-linear auto-encoder network that is specially tailored to generate an alternative lower-dimensional embedding suitable for both linear and non-linear batch effects. BEENE simultaneously learns the batch and biological variables from RNA-seq data, resulting in an embedding that is more robust and sensitive than the PCA embedding in terms of detecting and quantifying batch effects. BEENE was assessed on a collection of carefully controlled simulated datasets as well as biological datasets, including two technical replicates of mouse embryogenesis cells, peripheral blood mononuclear cells from three largely different experiments, and five studies of pancreatic islet cells.

BEENE is freely available as an open-source project at https://github.com/ashiq24/BEENE.

Supplementary material: SM.pdf.

© The Author(s) 2023. Published by Oxford University Press.
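To make the PCA baseline mentioned in the abstract concrete, here is a minimal sketch of how a linear batch effect might be detected in a PCA embedding. It is not BEENE and is not taken from the BEENE repository. The simulated data, the additive gene-wise shift, the helper names (`pca_embed`, `batch_r2`, `simulate`), and the choice of a between-batch variance fraction as the score are all assumptions made purely for illustration.

```python
import numpy as np


def pca_embed(X, n_components=2):
    """Project samples (rows of X) onto the top principal components."""
    Xc = X - X.mean(axis=0)                      # center each gene
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def batch_r2(scores, batch):
    """Fraction of variance in a 1-D score vector explained by batch labels."""
    grand_mean = scores.mean()
    total = np.sum((scores - grand_mean) ** 2)
    between = sum(
        np.sum(batch == b) * (scores[batch == b].mean() - grand_mean) ** 2
        for b in np.unique(batch)
    )
    return between / total


def simulate(shift, n_cells=200, n_genes=50, seed=0):
    """Hypothetical toy data: Gaussian expression plus an additive batch shift."""
    rng = np.random.default_rng(seed)
    batch = np.repeat([0, 1], n_cells // 2)
    X = rng.normal(size=(n_cells, n_genes))
    offset = rng.normal(size=n_genes) * shift    # gene-wise linear batch effect
    X[batch == 1] += offset
    return X, batch


X_batch, batch = simulate(shift=2.0)   # data with a linear batch effect
X_clean, _ = simulate(shift=0.0)       # same noise, no batch effect

r2_batch = batch_r2(pca_embed(X_batch)[:, 0], batch)
r2_clean = batch_r2(pca_embed(X_clean)[:, 0], batch)
print(f"PC1 variance explained by batch (with effect):    {r2_batch:.3f}")
print(f"PC1 variance explained by batch (without effect): {r2_clean:.3f}")
```

In this toy setting the first principal component aligns with the additive shift, so the batch label explains most of its variance. A batch effect that is not a linear shift need not show up this way, which is the limitation the abstract attributes to linear embeddings.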
