About me
Hi, I’m Dongyuan, a new tenure-track assistant professor in the Department of Genetics and Genome Sciences at the University of Connecticut Health Center (UConn Health).
I received my Ph.D. in Bioinformatics from the University of California, Los Angeles (UCLA), under the supervision of Dr. Jingyi Jessica Li in the Department of Statistics & Data Science. Before that, I received a Master of Science in Computational Biology & Quantitative Genetics from the Department of Biostatistics, Harvard T.H. Chan School of Public Health, under the supervision of Dr. Rafael Irizarry. I obtained my Bachelor of Science in Biological Science from Fudan University, Shanghai, China, under the supervision of Dr. Bao-Rong Lu.
I am looking for students and postdocs to work on computational genomics. Please email dosong@uchc.edu if you are interested. Prospective students should first apply to the Ph.D. program in Biomedical Science. This year's deadline is 12/01/2024. Current students in the Ph.D. program in Biomedical Science are welcome to rotate in the lab.
My research focuses on developing computational tools for analyzing single-cell and spatial omics data. Some of my previous work includes:
- Probabilistic generative models of high-dimensional single-cell and spatial multi-omics data. I developed scDesign3 (Nature Biotechnology, 2024), an “all-in-one” multimodal single-cell and spatial omics simulator that summarizes the input real dataset into a parametric model. I also contributed to the development of scReadSim (Nature Communications, 2023), a single-cell RNA-seq and ATAC-seq read simulator, and scDesign2 (Genome Biology, 2021), the predecessor of scDesign3.
- Differential expression (DE) testing and false discovery rate (FDR) control. We developed ClusterDE (bioRxiv, 2023), a post-clustering DE method that controls the FDR under “double dipping” (i.e., first clustering cells and then testing for DE between the resulting clusters). Previously, I developed PseudotimeDE (Genome Biology, 2021), a DE method that tests for gene expression changes along cell pseudotime while accounting for the uncertainty of the inferred pseudotime.
- Informative gene and cell selection in large-scale scRNA-seq data. We developed scPNMF (Bioinformatics, 2021), a gene selection method that picks out a small number of informative genes. We also developed scSampler (Bioinformatics, 2022), a diversity-preserving cell subsampling method for large-scale datasets.